Methods For Producing Secreted Polypeptides

ABSTRACT

The present invention relates to methods for producing a polypeptide, comprising: (a) cultivating a fungal host cell in a medium conducive for the production of the polypeptide, wherein the fungal host cell comprises a nucleic acid construct comprising a first nucleotide sequence encoding a signal peptide operably linked to a second nucleotide sequence encoding the polypeptide, wherein the first nucleotide sequence is foreign to the second nucleotide sequence and the 3′ end of the first nucleotide sequence is immediately upstream of the initiator codon of the second nucleotide sequence. The present invention also relates to the isolated signal peptide sequences and to constructs, vectors, and fungal host cells comprising the signal peptide sequences operably linked to nucleotide sequences encoding polypeptides.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a divisional of U.S. application Ser. No. 13/350,384, filed Jan. 13, 2012, which is a divisional of U.S. application Ser. No. 12/135,611, filed Jun. 9, 2008, now abandoned, which is a divisional of U.S. application Ser. No. 10/837,318, filed Apr. 30, 2004, now U.S. Pat. No. 7,393,664, which claims the benefit of U.S. Provisional Application No. 60/467,766, filed May 2, 2003, which applications are incorporated herein by reference.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

This invention was made with Government support under NREL Subcontract No. ZCO-30017-02, Prime Contract DE-AC36-98G010337 awarded by the Department of Energy. The government has certain rights in this invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to methods for producing secreted polypeptides. The present invention also relates to isolated nucleotide sequences encoding signal peptides and nucleic acid constructs, vectors, and host cells comprising the signal peptide sequences operably linked to nucleotide sequences encoding polypeptides.

2. Description of the Related Art

The recombinant production of a heterologous protein in a fungal host cell, particularly a filamentous fungal cell such as Aspergillus or a yeast cell such as Saccharomyces, may provide for a more desirable vehicle for producing the protein in commercially relevant quantities.

Recombinant production of a heterologous protein is generally accomplished by constructing an expression cassette in which the DNA coding for the protein is placed under the expression control of a promoter, excised from a regulated gene, suitable for the host cell. The expression cassette is introduced into the host cell, usually by plasmid-mediated transformation. Production of the heterologous protein is then achieved by culturing the transformed host cell under inducing conditions necessary for the proper functioning of the promoter contained on the expression cassette.

Improvement of the recombinant production of proteins generally requires the availability of new regulatory sequences which are suitable for controlling the expression of the proteins in a host cell.

U.S. Pat. No. 6,015,703 discloses genetic constructs comprising a promoter, a xylanase secretion signal, and a mature beta-glucosidase coding region. The disclosed constructs, when expressed in recombinant microbes, dramatically increase the amount of beta-glucosidase produced relative to untransformed microbes.

WO 91/17243 discloses an endoglucanase V and the gene thereof from Humicola insolens DSM 1800.

It is an object of the present invention to provide improved methods for producing a polypeptide in a fungal host cell using signal peptide sequences.

SUMMARY OF THE INVENTION

The present invention relates to methods for producing a secreted polypeptide, comprising:

(a) cultivating a fungal host cell in a medium conducive for the production of the polypeptide, wherein the fungal host cell comprises a nucleic acid construct comprising a first nucleotide sequence encoding a signal peptide operably linked to a second nucleotide sequence encoding the polypeptide, wherein the first nucleotide sequence is foreign to the second nucleotide sequence, the 3′ end of the first nucleotide sequence is immediately upstream of the initiator codon of the second nucleotide sequence, and the first nucleotide sequence is selected from the group consisting of:

(i) a nucleotide sequence encoding a signal peptide having an amino acid sequence which has at least 70% identity with SEQ ID NO: 37;

(ii) a nucleotide sequence having at least 70% homology with SEQ ID NO: 36; and

(iii) a nucleotide sequence which hybridizes under stringency conditions with the nucleotides of SEQ ID NO: 36, or its complementary strand, wherein the stringency conditions are defined as prehybridization, hybridization, and washing post-hybridization at 5° C. to 10° C. below the calculated T_(m) in 0.9 M NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1× Denhardt's solution, 1 mM sodium pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of yeast RNA per ml, and washing once in 6×SSC plus 0.1% SDS for 15 minutes and twice each for 15 minutes using 6×SSC at 5° C. to 10° C. below the calculated T_(m); and

(b) isolating the secreted polypeptide from the cultivation medium.

The present invention also relates to isolated signal peptide sequences and to constructs, vectors, and fungal host cells comprising the signal peptide sequences operably linked to nucleotide sequences encoding polypeptides.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows a restriction map of pAILo1.

FIG. 2 shows a restriction map of pMJ04.

FIG. 3 shows a restriction map of pCaHj527.

FIG. 4 shows a restriction map of pMT2188.

FIG. 5 shows a restriction map of pCaHj568.

FIG. 6 shows a restriction map of pMJ05.

FIG. 7 shows a restriction map of pSMai130.

FIG. 8 shows the DNA sequence (SEQ ID NO: 34) and deduced amino acid sequence (SEQ ID NO: 35) of the secretion signal sequence of an Aspergillus oryzae beta-glucosidase.

FIG. 9 shows the DNA sequence (SEQ ID NO: 36) and deduced amino acid sequence (SEQ ID NO: 37) of the secretion signal sequence of a Humicola insolens endoglucanase V.

FIG. 10 shows a restriction map of pSMai135.

FIG. 11 shows a restriction map of pSATe101.

FIG. 12 shows a restriction map of pSATe111.

FIG. 13 shows a restriction map of pALFd1.

FIG. 14 shows a restriction map of pAILo2.

FIG. 15 shows a restriction map of pEJG97.

FIGS. 16A and 16B show the genomic DNA sequence and the deduced amino acid sequence of an Aspergillus fumigatus beta-glucosidase (SEQ ID NOS: 46 and 47, respectively). The predicted signal peptide is underlined and predicted introns are italicized.

FIG. 17 shows a restriction map of pCR4Blunt-TOPOAfcDNA5′.

FIG. 18 shows a restriction map of pCR4Blunt-TOPOAfcDNA3′.

FIG. 19 shows a restriction map of pCR4Blunt-TOPOAfcDNA.

FIG. 20 shows a restriction map of pALFd7.

FIG. 21 shows a restriction map of pALFd6.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to methods for producing a polypeptide, comprising: (a) cultivating a fungal host cell in a medium conducive for the production of the polypeptide, wherein the fungal host cell comprises a nucleic acid construct comprising a first nucleotide sequence encoding a signal peptide operably linked to a second nucleotide sequence encoding the polypeptide, wherein the first nucleotide sequence is foreign to the second nucleotide sequence and the 3′ end of the first nucleotide sequence is immediately upstream of the initiator codon of the second nucleotide sequence. The first nucleotide sequence is selected from the group consisting of: (i) a nucleotide sequence encoding a signal peptide having an amino acid sequence which has at least 70% identity with SEQ ID NO: 37; (ii) a nucleotide sequence having at least 70% homology with SEQ ID NO: 36; and (iii) a nucleotide sequence which hybridizes under stringency conditions with the nucleotides of SEQ ID NO: 36, or its complementary strand, wherein the stringency conditions are defined as prehybridization, hybridization, and washing post-hybridization at 5° C. to 10° C. below the calculated T_(m) in 0.9 M NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1× Denhardt's solution, 1 mM sodium pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of yeast RNA per ml, and washing once in 6×SSC plus 0.1% SDS for 15 minutes and twice each for 15 minutes using 6×SSC at 5° C. to 10° C. below the calculated T_(m); and (b) isolating the secreted polypeptide from the cultivation medium.

In the production methods of the present invention, the fungal host cells are cultivated in a nutrient medium suitable for production of the polypeptide using methods known in the art. For example, the cell may be cultivated by shake flask cultivation, or small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection).

The polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate.

In the methods of the present invention, the fungal cell preferably produces at least about 25% more, more preferably at least about 50% more, more preferably at least about 75% more, more preferably at least about 100% more, even more preferably at least about 200% more, most preferably at least about 300% more, and even most preferably at least about 400% more polypeptide relative to a fungal cell containing a native signal peptide sequence operably linked to a nucleotide sequence encoding the polypeptide when cultured under identical production conditions.
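The comparison described above is a simple relative-yield calculation. The following is a minimal sketch of it in Python, offered only as an illustration: the function names and the example yields are hypothetical, and the thresholds are the "at least about" percentages recited above, treated here as exact cut-offs for simplicity.

```python
# Thresholds (percent more polypeptide) recited in the text, least to most preferred.
THRESHOLDS = (25, 50, 75, 100, 200, 300, 400)


def percent_increase(test_yield: float, reference_yield: float) -> float:
    """Percent more polypeptide made by the test strain than by the reference strain.

    The reference strain carries the native signal peptide sequence; both yields
    must be measured in the same units under identical production conditions.
    """
    if reference_yield <= 0:
        raise ValueError("reference yield must be positive")
    return (test_yield - reference_yield) / reference_yield * 100.0


def highest_threshold_met(test_yield: float, reference_yield: float):
    """Return the largest recited threshold that is reached, or None if below 25%."""
    gain = percent_increase(test_yield, reference_yield)
    met = [t for t in THRESHOLDS if gain >= t]
    return met[-1] if met else None
```

For example, a hypothetical test strain yielding 3.2 units against a reference of 1.0 unit produces about 220% more polypeptide, which reaches the 200% tier but not the 300% tier.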

The resulting secreted polypeptide can be recovered directly from the medium by methods known in the art. For example, the polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation.

The polypeptides may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989).

Signal Peptide Sequences

The term “signal peptide sequence” is defined herein as a peptide coding region that codes for an amino acid sequence linked to the amino terminus of a polypeptide and directs the encoded polypeptide into the cell's secretory pathway.

The term “operably linked” is defined herein as a configuration in which a control sequence, e.g., a signal peptide sequence, is appropriately placed at a position relative to a coding sequence such that the control sequence directs the production of a polypeptide encoded by the coding sequence.

The term “coding sequence” is defined herein as a nucleotide sequence that is transcribed into mRNA which is translated into a polypeptide when placed under the control of the appropriate control sequences. The boundaries of the coding sequence are generally determined by the start codon located at the beginning of the open reading frame of the 5′ end of the mRNA and a stop codon located at the 3′ end of the open reading frame of the mRNA. A coding sequence can include, but is not limited to, genomic DNA, cDNA, semisynthetic, synthetic, and recombinant nucleotide sequences.

The 5′ end of the polypeptide coding sequence may contain a native signal peptide coding region naturally linked in translation reading frame with the segment of the coding region which encodes the polypeptide, wherein the signal peptide coding region of the present invention may simply replace the natural signal peptide coding region in order to enhance secretion of the polypeptide. Alternatively, the 5′ end of the polypeptide coding sequence may lack a native signal peptide coding region.

In the methods of the present invention, the signal peptide sequence is foreign to the nucleotide sequence encoding a polypeptide of interest, but the signal peptide sequence or nucleotide sequence may be native to the fungal host cell.

In a first aspect, the isolated nucleotide sequences encoding a signal peptide have a degree of identity to SEQ ID NO: 37 of at least about 70%, preferably at least about 75%, more preferably at least about 80%, more preferably at least about 85%, even more preferably at least about 90%, most preferably at least about 95%, and even most preferably at least about 97%, which have the ability to direct a polypeptide into a cell's secretory pathway (hereinafter “homologous signal peptides”). In a preferred aspect, the homologous signal peptides have an amino acid sequence which differs by five amino acids, preferably by four amino acids, more preferably by three amino acids, even more preferably by two amino acids, and most preferably by one amino acid from SEQ ID NO: 37. For purposes of the present invention, the degree of identity between two amino acid sequences is determined by the Clustal method (Higgins, 1989, CABIOS 5: 151-153) using the LASERGENE™ MEGALIGN™ software (DNASTAR, Inc., Madison, Wis.) with an identity table and the following multiple alignment parameters: Gap penalty of 10 and gap length penalty of 10. Pairwise alignment parameters are Ktuple=1, gap penalty=3, windows=5, and diagonals=5.
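To make the notion of "degree of identity" concrete, the sketch below computes percent identity for two sequences that have already been aligned. It is not the Clustal method and does not reproduce the MEGALIGN™ parameters named above; it assumes the alignment is supplied (gaps written as "-") and divides matches by the number of alignment columns, which is only one of several conventions in use. The peptide strings in the usage note are hypothetical and are not SEQ ID NO: 37.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical residue pairs / alignment columns,
    ignoring columns where both sequences have a gap ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With the hypothetical ten-residue peptides "MKLAVSTALA" and "MKIAVSTGLA", which differ at two positions, the function reports 80% identity, so the pair would pass a 70% threshold but not a 90% one.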

Preferably, the nucleotide sequences encode signal peptides that comprise the amino acid sequence of SEQ ID NO: 37, or allelic variants thereof; or fragments thereof that have the ability to direct the polypeptide into a cell's secretory pathway. In a more preferred aspect, a nucleotide sequence of the present invention encodes a signal peptide that comprises the amino acid sequence of SEQ ID NO: 37. In another preferred aspect, the nucleotide sequence encodes a signal peptide that consists of the amino acid sequence of SEQ ID NO: 37, or a fragment thereof, wherein the signal peptide fragment has the ability to direct a polypeptide into a cell's secretory pathway. In another more preferred aspect, the nucleotide sequence of the present invention encodes a signal peptide that consists of the amino acid sequence of SEQ ID NO: 37.

The present invention also encompasses nucleotide sequences which encode a signal peptide having the amino acid sequence of SEQ ID NO: 37, which differ from SEQ ID NO: 36 by virtue of the degeneracy of the genetic code. The present invention also relates to subsequences of SEQ ID NO: 36 which encode fragments of SEQ ID NO: 37 which have the ability to direct a polypeptide into a cell's secretory pathway.
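Degeneracy of the genetic code means that different nucleotide sequences can encode the same peptide. The sketch below illustrates this with the standard genetic code; the helper names and the short example sequences are hypothetical and are not taken from SEQ ID NO: 36 or SEQ ID NO: 37.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_peptide(dna_a: str, dna_b: str) -> bool:
    """True if two coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For instance, the hypothetical sequences "ATGCGTTTC" and "ATGAGATTT" differ at four nucleotide positions yet both translate to Met-Arg-Phe, so `encode_same_peptide` returns True for that pair.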

A subsequence of SEQ ID NO: 36 is a nucleic acid sequence encompassed by SEQ ID NO: 36 except that one or more nucleotides from the 5′ and/or 3′ end have been deleted. Preferably, a subsequence contains at least 45 nucleotides, more preferably at least 51 nucleotides, and most preferably at least 57 nucleotides. A fragment of SEQ ID NO: 37 is a polypeptide having one or more amino acids deleted from the amino and/or carboxy terminus of this amino acid sequence. Preferably, a fragment contains at least 15 amino acid residues, more preferably at least 17 amino acid residues, and most preferably at least 19 amino acid residues.
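The definitions above describe terminal truncations: residues are removed only from one or both ends, and a minimum length is retained. The preferred nucleotide minimums (45, 51, 57) are three times the preferred amino acid minimums (15, 17, 19), consistent with three nucleotides per codon. A minimal sketch follows; the function name and the toy input are hypothetical.

```python
def terminal_truncations(sequence: str, min_length: int):
    """Yield every sequence obtained by deleting one or more residues from the
    5'/amino end and/or the 3'/carboxy end, keeping at least min_length residues.

    The full-length sequence itself is excluded because at least one residue
    must be deleted.
    """
    n = len(sequence)
    for start in range(n):
        for end in range(start + min_length, n + 1):
            if end - start < n:
                yield sequence[start:end]


# Preferred minimum lengths from the text: nucleotides and the matching residues.
NUCLEOTIDE_MINIMUMS = (45, 51, 57)
AMINO_ACID_MINIMUMS = tuple(n // 3 for n in NUCLEOTIDE_MINIMUMS)
```

The same function applies to a nucleotide subsequence or a peptide fragment; only the minimum length changes.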

An allelic variant denotes any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in polymorphism within populations. Gene mutations can be silent (no change in the encoded signal peptide) or may encode signal peptides having altered amino acid sequences. The allelic variant of a signal peptide is a peptide encoded by an allelic variant of a gene.

In a preferred aspect, the first nucleotide sequence is the signal peptide coding sequence of the endoglucanase V gene contained in Humicola insolens DSM 1800.

In a second aspect, the isolated nucleic acid sequences encoding a signal peptide have a degree of homology to SEQ ID NO: 36 of at least about 70%, preferably at least about 75%, more preferably at least about 80%, more preferably at least about 85%, even more preferably at least about 90% homology, most preferably at least about 95% homology, and even most preferably at least about 97% homology, which encode a signal peptide; or allelic variants and subsequences of SEQ ID NO: 36 which encode signal peptide fragments which have the ability to direct a polypeptide into a cell's secretory pathway. For purposes of the present invention, the degree of homology between two nucleic acid sequences is determined by the Wilbur-Lipman method (Wilbur and Lipman, 1983, Proceedings of the National Academy of Sciences USA 80: 726-730) using the LASERGENE™ MEGALIGN™ software (DNASTAR, Inc., Madison, Wis.) with an identity table and the following multiple alignment parameters: Gap penalty of 10 and gap length penalty of 10. Pairwise alignment parameters are Ktuple=3, gap penalty=3, and windows=20.

In a third aspect, the isolated nucleotide sequences encode signal peptides, wherein the nucleotide sequences hybridize under stringency conditions with the nucleotides of SEQ ID NO: 36, or its complementary strand (J. Sambrook, E. F. Fritsch, and T. Maniatis, 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, N.Y.).

The nucleotide sequence of SEQ ID NO: 36 or a subsequence thereof, as well as the amino acid sequence of SEQ ID NO: 37 or a fragment thereof, may be used to design a nucleic acid probe to identify and clone DNA encoding signal peptides from strains of different genera or species according to methods well known in the art. In particular, such probes can be used for hybridization with the genomic or cDNA of the genus or species of interest, following standard Southern blotting procedures, in order to identify and isolate the corresponding gene therein. Such probes can be considerably shorter than the entire sequence, but should be at least 15, preferably at least 25, and more preferably at least 35 nucleotides in length. Both DNA and RNA probes can be used. The probes are typically labeled for detecting the corresponding gene (for example, with ³²P, ³H, ³⁵S, biotin, or avidin). Such probes are encompassed by the present invention.

Thus, a genomic DNA or cDNA library prepared from such other organisms may be screened for DNA which hybridizes with the probes described above and which encodes a signal peptide. Genomic or other DNA from such other organisms may be separated by agarose or polyacrylamide gel electrophoresis, or other separation techniques. DNA from the libraries or the separated DNA may be transferred to and immobilized on nitrocellulose or other suitable carrier material. In order to identify a clone or DNA which is homologous with SEQ ID NO: 36 or a subsequence thereof, the carrier material is used in a Southern blot. For purposes of the present invention, hybridization indicates that the nucleic acid sequence hybridizes to a labeled nucleic acid probe corresponding to the nucleic acid sequence shown in SEQ ID NO: 36, its complementary strand, or a subsequence thereof, under stringency conditions defined herein. Molecules to which the nucleic acid probe hybridizes under these conditions can be detected using X-ray film.

In a preferred aspect, the nucleic acid probe is a nucleotide sequence which encodes the signal peptide of SEQ ID NO: 37, or a subsequence thereof. In another preferred aspect, the nucleic acid probe is SEQ ID NO: 36. In another preferred aspect, the nucleic acid probe is the signal peptide coding sequence of the endoglucanase V gene contained in Humicola insolens DSM 1800.

For short probes which are about 15 nucleotides to about 60 nucleotides in length, stringency conditions are defined as prehybridization, hybridization, and washing post-hybridization at 5° C. to 10° C. below the calculated T_(m) using the calculation according to Bolton and McCarthy (1962, Proceedings of the National Academy of Sciences USA 48: 1390) in 0.9 M NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1× Denhardt's solution, 1 mM sodium pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of yeast RNA per ml following standard Southern blotting procedures.

For short probes which are about 15 nucleotides to about 60 nucleotides in length, the carrier material is washed once in 6× SSC plus 0.1% SDS for 15 minutes and twice each for 15 minutes using 6× SSC at 5° C. to 10° C. below the calculated T_(m).
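The hybridization and wash temperatures above are set relative to a calculated T_(m). The text does not reproduce the formula, so the sketch below is an assumption: it uses a widely quoted salt- and GC-dependent approximation often attributed to Bolton and McCarthy, T_(m) = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) − 600/N, with no formamide term. The coefficients, the function names, the example probe, and the choice to count only the 0.9 M NaCl as the sodium concentration are all illustrative assumptions rather than the patent's stated calculation.

```python
import math


def gc_percent(probe: str) -> float:
    """Percentage of G and C bases in the probe."""
    seq = probe.upper()
    if not seq:
        raise ValueError("probe must not be empty")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def estimate_tm(probe: str, sodium_molar: float = 0.9) -> float:
    """Approximate melting temperature in degrees C (assumed formula, see lead-in)."""
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent(probe)
        - 600.0 / len(probe)
    )


def hybridization_window(tm: float) -> tuple:
    """Temperature range 10 to 5 degrees C below the calculated Tm, as (low, high)."""
    return (tm - 10.0, tm - 5.0)
```

Given a probe sequence, `hybridization_window(estimate_tm(probe))` returns the range in which prehybridization, hybridization, and washing would be carried out under this assumed formula. Approximations of this kind are known to be less reliable at the short end of the 15 to 60 nucleotide range.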

In a fourth aspect, the isolated nucleic acid sequences encode variants of the signal peptide having an amino acid sequence of SEQ ID NO: 37 comprising a substitution, deletion, and/or insertion of one or more amino acids.

The amino acid sequences of the variant signal peptides may differ from the amino acid sequence of SEQ ID NO: 37 by an insertion or deletion of one or more amino acid residues and/or the substitution of one or more amino acid residues by different amino acid residues. Preferably, amino acid changes are of a minor nature, such as conservative amino acid substitutions that do not significantly affect the activity of the signal peptide; or small deletions, typically of one to about 5 amino acids.

Examples of conservative substitutions are within the group of basic amino acids (arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids (glycine, alanine, serine, threonine and methionine). Amino acid substitutions that do not generally alter the specific activity are known in the art and are described, for example, by H. Neurath and R. L. Hill, 1979, In, The Proteins, Academic Press, New York. The most commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, and Asp/Gly as well as these in reverse.
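The six groups named above can be expressed as a small lookup. The sketch below encodes only those groups, using one-letter amino acid codes; the function names are hypothetical. It does not cover the list of commonly occurring exchanges, several of which (for example Ala/Val or Ser/Asn) cross group boundaries.

```python
# Groups of amino acids as listed in the text, in one-letter code.
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),        # arginine, lysine, histidine
    "acidic": set("ED"),        # glutamic acid, aspartic acid
    "polar": set("QN"),         # glutamine, asparagine
    "hydrophobic": set("LIV"),  # leucine, isoleucine, valine
    "aromatic": set("FWY"),     # phenylalanine, tryptophan, tyrosine
    "small": set("GASTM"),      # glycine, alanine, serine, threonine, methionine
}


def shared_group(residue_a: str, residue_b: str):
    """Return the name of the group containing both residues, or None."""
    a, b = residue_a.upper(), residue_b.upper()
    for name, members in CONSERVATIVE_GROUPS.items():
        if a in members and b in members:
            return name
    return None


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if the two residues differ but belong to the same listed group."""
    return residue_a.upper() != residue_b.upper() and shared_group(residue_a, residue_b) is not None
```

Under this grouping, replacing lysine with arginine counts as conservative, while replacing lysine with aspartic acid does not.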

The present invention also relates to the isolated signal peptide sequences disclosed supra.

Polypeptide Encoding Nucleotide Sequences

The polypeptide encoded by the second nucleotide sequence may be native or heterologous to the fungal host cell of interest.

The term “polypeptide” is not meant herein to refer to a specific length of the encoded product and, therefore, encompasses peptides, oligopeptides, and proteins. The term “heterologous polypeptide” is defined herein as a polypeptide which is not native to the fungal cell, a native polypeptide in which modifications have been made to alter the native sequence, or a native polypeptide whose expression is quantitatively altered as a result of a manipulation of the gene encoding the polypeptide by recombinant DNA techniques. The fungal cell may contain one or more copies of the nucleotide sequence encoding the polypeptide.

Preferably, the polypeptide is a hormone or variant thereof, enzyme, receptor or portion thereof, antibody or portion thereof, or reporter. In a preferred aspect, the polypeptide is an oxidoreductase, transferase, hydrolase, lyase, isomerase, or ligase. In a more preferred aspect, the polypeptide is an aminopeptidase, amylase, carbohydrase, carboxypeptidase, catalase, cellulase, cellobiohydrolase, chitinase, cutinase, cyclodextrin glycosyltransferase, deoxyribonuclease, endoglucanase, esterase, alpha-galactosidase, beta-galactosidase, glucoamylase, alpha-glucosidase, beta-glucosidase, invertase, laccase, lipase, mannosidase, mutanase, oxidase, pectinolytic enzyme, peroxidase, phospholipase, phytase, polyphenoloxidase, proteolytic enzyme, ribonuclease, transglutaminase, xylanase, or beta-xylosidase. In a most preferred aspect, the polypeptide is an endoglucanase, cellobiohydrolase, and/or beta-glucosidase useful in converting cellulose to glucose including, but not limited to, endoglucanase I, endoglucanase II, endoglucanase III, endoglucanase IV, endoglucanase V, cellobiohydrolase I, cellobiohydrolase II, and beta-glucosidase. Endoglucanase and cellobiohydrolase enzymes are collectively referred to as “cellulases.”

The nucleotide sequence encoding a polypeptide of interest may be obtained from any prokaryotic, eukaryotic, or other source. For purposes of the present invention, the term “obtained from” as used herein in connection with a given source shall mean that the polypeptide is produced by the source or by a cell in which a gene from the source has been inserted.

The techniques used to isolate or clone a nucleotide sequence encoding a polypeptide of interest are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof. The cloning of the nucleotide sequence from such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction (PCR). See, for example, Innis et al., 1990, PCR Protocols: A Guide to Methods and Application, Academic Press, New York. The cloning procedures may involve excision and isolation of a desired nucleotide fragment comprising the nucleotide sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into the mutant fungal cell where multiple copies or clones of the nucleotide sequence will be replicated. The nucleotide sequence may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

In the methods of the present invention, the polypeptide may also include a fused or hybrid polypeptide in which another polypeptide is fused at the N-terminus or the C-terminus of the polypeptide or fragment thereof. A fused polypeptide is produced by fusing a nucleotide sequence (or a portion thereof) encoding one polypeptide to a nucleotide sequence (or a portion thereof) encoding another polypeptide. Techniques for producing fusion polypeptides are known in the art, and include ligating the coding sequences encoding the polypeptides so that they are in frame and expression of the fused polypeptide is under control of the same promoter(s) and terminator. The hybrid polypeptide may comprise a combination of partial or complete polypeptide sequences obtained from at least two different polypeptides wherein one or more may be heterologous to the mutant fungal cell.
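The requirement that fused coding sequences be "in frame" can be checked computationally before any cloning is attempted. The sketch below is a simplified illustration, not a cloning protocol: it assumes both inputs are already coding sequences that begin at codon position one, and the function names and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(coding_dna: str) -> list:
    """Split an in-frame coding sequence into consecutive three-base codons."""
    return [coding_dna[i:i + 3] for i in range(0, len(coding_dna), 3)]


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Join two coding sequences so the downstream one stays in reading frame.

    Raises ValueError if either length is not a multiple of three or if the
    upstream sequence contains a stop codon, which would end translation
    before the downstream polypeptide is reached.
    """
    upstream = upstream_cds.upper()
    downstream = downstream_cds.upper()
    if len(upstream) % 3 != 0 or len(downstream) % 3 != 0:
        raise ValueError("both coding sequences must be a multiple of three bases long")
    if any(codon in STOP_CODONS for codon in split_codons(upstream)):
        raise ValueError("upstream coding sequence contains a stop codon")
    return upstream + downstream
```

For example, `fuse_in_frame("ATGAAA", "GCTTAA")` returns "ATGAAAGCTTAA", whereas an upstream sequence of five bases, or one ending in TAA, is rejected.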

Nucleic Acid Constructs

The present invention also relates to nucleic acid constructs comprising a nucleotide sequence encoding a polypeptide operably linked to a signal peptide sequence of the present invention and one or more control sequences which direct the expression of the coding sequence in a suitable host cell under conditions compatible with the control sequences. Expression will be understood to include any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.

“Nucleic acid construct” is defined herein as a nucleotide molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or which has been modified to contain segments of nucleic acids combined and juxtaposed in a manner that would not otherwise exist in nature. The term nucleic acid construct is synonymous with the term expression cassette when the nucleic acid construct contains a coding sequence and all the control sequences required for expression of the coding sequence.

An isolated nucleotide sequence encoding a polypeptide may be further manipulated in a variety of ways to provide for expression of the polypeptide. Manipulation of the nucleotide sequence prior to its insertion into a vector may be desirable or necessary depending on the expression vector. The techniques for modifying nucleotide sequences utilizing recombinant DNA methods are well known in the art.

In the methods of the present invention, the nucleotide sequence may comprise one or more native control sequences or one or more of the native control sequences may be replaced with one or more control sequences foreign to the nucleotide sequence for improving expression of the coding sequence in a host cell.

The term “control sequences” is defined herein to include all components which are necessary or advantageous for the expression of a polypeptide of interest. Each control sequence may be native or foreign to the nucleotide sequence encoding the polypeptide. Such control sequences include, but are not limited to, a leader, polyadenylation sequence, propeptide sequence, signal peptide sequence of the present invention, and transcription terminator. At a minimum, the control sequences include a signal peptide sequence of the present invention, and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleotide sequence encoding a polypeptide.

The control sequence may be an appropriate promoter sequence, which is recognized by a host cell for expression of the nucleotide sequence. The promoter sequence contains transcriptional control sequences which mediate the expression of the polypeptide. The promoter may be any sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.

Examples of suitable promoters for directing the transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium venenatum amyloglucosidase, Fusarium oxysporum trypsin-like protease (WO 96/00787), Trichoderma reesei cellobiohydrolase I, Trichoderma reesei cellobiohydrolase II, Trichoderma reesei endoglucanase I, Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III, Trichoderma reesei endoglucanase IV, Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei beta-xylosidase, as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase); and mutant, truncated, and hybrid promoters thereof.

In a yeast host, useful promoters are obtained from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH1, ADH2/GAP), Saccharomyces cerevisiae triose phosphate isomerase (TPI), Saccharomyces cerevisiae metallothionein (CUP1), and Saccharomyces cerevisiae 3-phosphoglycerate kinase. Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8: 423-488.

The control sequence may be a suitable transcription terminatorsequence, which is recognized by a host cell to terminate transcription.The terminator sequence is operably linked to the 3′ terminus of thenucleotide sequence encoding the polypeptide. Any terminator which isfunctional in the host cell of choice may be used in the presentinvention.

Preferred terminators for filamentous fungal host cells are obtained from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger alpha-glucosidase, and Fusarium oxysporum trypsin-like protease.

Preferred terminators for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are described by Romanos et al., 1992, supra.

The control sequence may also be a suitable leader sequence, a nontranslated region of an mRNA which is important for translation by the host cell. The leader sequence is operably linked to the 5′ terminus of the nucleotide sequence encoding the polypeptide. Any leader sequence that is functional in the host cell of choice may be used in the present invention.

Preferred leaders for filamentous fungal host cells are obtained from the genes for Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase.

Suitable leaders for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).

The control sequence may also be a polyadenylation sequence, which is operably linked to the 3′ terminus of the nucleotide sequence and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence which is functional in the host cell of choice may be used in the present invention.

Preferred polyadenylation sequences for filamentous fungal host cells are obtained from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease, and Aspergillus niger alpha-glucosidase.

Useful polyadenylation sequences for yeast host cells are described by Guo and Sherman, 1995, Molecular Cellular Biology 15: 5983-5990.

The control sequence may also be a propeptide coding region that codes for an amino acid sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is generally inactive and can be converted to a mature active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding region may be obtained from the genes for Saccharomyces cerevisiae alpha-factor, Rhizomucor miehei aspartic proteinase, and Myceliophthora thermophila laccase (WO 95/33836).

Where both signal peptide and propeptide regions are present at the amino terminus of a polypeptide, the propeptide region is positioned next to the amino terminus of a polypeptide and the signal peptide region is positioned next to the amino terminus of the propeptide region.

It may also be desirable to add regulatory sequences which allow the regulation of the expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems are those which cause the expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the TAKA alpha-amylase promoter, Aspergillus niger glucoamylase promoter, and Aspergillus oryzae glucoamylase promoter may be used as regulatory sequences. Other examples of regulatory sequences are those which allow for gene amplification. In eukaryotic systems, these include the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the metallothionein genes which are amplified with heavy metals. In these cases, the nucleotide sequence encoding the polypeptide would be operably linked with the regulatory sequence.

Expression Vectors

The present invention also relates to recombinant expression vectors comprising a signal peptide sequence of the present invention, a nucleotide sequence encoding a polypeptide of interest, and transcriptional and translational stop signals. The various nucleotide and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the promoter and/or nucleotide sequence encoding the polypeptide at such sites. Alternatively, the nucleotide sequence may be expressed by inserting the nucleotide sequence or a nucleic acid construct comprising the signal peptide sequence and/or nucleotide sequence encoding the polypeptide into an appropriate vector for expression. In creating the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with a signal peptide sequence of the present invention and one or more appropriate control sequences for expression.

The recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the nucleotide sequence. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vectors may be linear or closed circular plasmids.

The vectors of the present invention preferably contain one or more selectable markers which permit easy selection of transformed cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Suitable markers for yeast host cells include, but are not limited to, ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3. Selectable markers for use in a filamentous fungal host cell include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5′-phosphate decarboxylase), sC (sulfate adenyltransferase), trpC (anthranilate synthase), as well as equivalents thereof. Preferred for use in an Aspergillus cell are the amdS and pyrG genes of Aspergillus nidulans or Aspergillus oryzae and the bar gene of Streptomyces hygroscopicus.

The vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used.

The vectors of the present invention preferably contain an element(s) that permits stable integration of the vector into the host cell's genome or autonomous replication of the vector in the cell independent of the genome.

For integration into the host cell genome, the vector may rely on the nucleotide sequence encoding the polypeptide or any other element of the vector for stable integration of the vector into the genome by homologous or nonhomologous recombination. Alternatively, the vector may contain additional nucleotide sequences for directing integration by homologous recombination into the genome of the host cell. The additional nucleotide sequences enable the vector to be integrated into the host cell genome at a precise location(s) in the chromosome(s). To increase the likelihood of integration at a precise location, the integrational elements should preferably contain a sufficient number of nucleotides, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding target sequence to enhance the probability of homologous recombination. The integrational elements may be any sequence that is homologous with the target sequence in the genome of the host cell. Furthermore, the integrational elements may be non-encoding or encoding nucleotide sequences. On the other hand, the vector may be integrated into the genome of the host cell by non-homologous recombination.

For autonomous replication, the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question. The origin of replication may be any plasmid replicator mediating autonomous replication which functions in a cell. The term “origin of replication” or “plasmid replicator” is defined herein as a nucleotide sequence that enables a plasmid or vector to replicate in vivo.

Examples of origins of replication for use in a yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1 and CEN3, and the combination of ARS4 and CEN6. The origin of replication may be one having a mutation which makes its functioning temperature-sensitive in the host cell (see, e.g., Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75: 1433).

Examples of origins of replication useful in a filamentous fungal cell are AMA1 and ANS1 (Gems et al., 1991, Gene 98: 61-67; Cullen et al., 1987, Nucleic Acids Research 15: 9163-9175; WO 00/24883). Isolation of the AMA1 gene and construction of plasmids or vectors comprising the gene can be accomplished according to the methods disclosed in WO 00/24883.

More than one copy of a nucleotide sequence encoding a polypeptide may be inserted into the host cell to increase production of the gene product. An increase in the copy number of the nucleotide sequence can be obtained by integrating at least one additional copy of the sequence into the host cell genome or by including an amplifiable selectable marker gene with the nucleotide sequence where cells containing amplified copies of the selectable marker gene, and thereby additional copies of the nucleotide sequence, can be selected for by cultivating the cells in the presence of the appropriate selectable agent.

The procedures used to ligate the elements described above to construct the recombinant expression vectors of the present invention are well known to one skilled in the art (see, e.g., Sambrook et al., 1989, supra).

Host Cells

The present invention also relates to recombinant host cells, comprising a signal peptide sequence of the present invention operably linked to a nucleotide sequence encoding a polypeptide, which are advantageously used in the recombinant production of the polypeptides. A vector comprising a signal peptide sequence of the present invention operably linked to a nucleotide sequence encoding a polypeptide is introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector as described earlier. The term “host cell” encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide and its source.

The host cell may be any fungal cell useful in the methods of the present invention. “Fungi” as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra).

In a preferred aspect, the fungal host cell is a yeast cell. “Yeast” as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F. A., Passmore, S. M., and Davenport, R. R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980).

In a more preferred aspect, the yeast host cell is a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell. In a most preferred aspect, the yeast host cell is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, or Saccharomyces oviformis cell. In another most preferred aspect, the yeast host cell is a Kluyveromyces lactis cell. In another most preferred aspect, the yeast host cell is a Yarrowia lipolytica cell.

In another preferred aspect, the fungal host cell is a filamentous fungal cell. “Filamentous fungi” include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). The filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative.

In a more preferred aspect, the filamentous fungal host cell is a cell of a species of, but not limited to, Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora, Neurospora, Penicillium, Thielavia, Tolypocladium, or Trichoderma.

In an even more preferred aspect, the filamentous fungal host cell is an Aspergillus awamori, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, or Aspergillus oryzae cell. In another even more preferred aspect, the filamentous fungal host cell is a Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, or Fusarium venenatum cell. In another even more preferred aspect, the filamentous fungal host cell is a Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Thielavia terrestris, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride cell.

In a most preferred aspect, the Fusarium venenatum cell is Fusarium venenatum A3/5, which was originally deposited as Fusarium graminearum ATCC 20334 and recently reclassified as Fusarium venenatum by Yoder and Christianson, 1998, Fungal Genetics and Biology 23: 62-80 and O'Donnell et al., 1998, Fungal Genetics and Biology 23: 57-67; as well as taxonomic equivalents of Fusarium venenatum regardless of the species name by which they are currently known. In another preferred aspect, the Fusarium venenatum cell is a morphological mutant of Fusarium venenatum A3/5 or Fusarium venenatum ATCC 20334, as disclosed in WO 97/26330.

In another most preferred aspect, the Trichoderma cell is Trichoderma reesei ATCC 56765, Trichoderma reesei ATCC 13631, Trichoderma reesei CBS 526.94, Trichoderma reesei CBS 529.94, Trichoderma longibrachiatum CBS 528.94, Trichoderma longibrachiatum ATCC 2106, Trichoderma longibrachiatum CBS 592.94, Trichoderma viride NRRL 3652, Trichoderma viride CBS 517.94, or Trichoderma viride NIBH FERM/BP 447.

Fungal cells may be transformed by a process involving protoplast formation, transformation of the protoplasts, and regeneration of the cell wall in a manner known per se. Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023 and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81: 1470-1474. Suitable procedures for transformation of Trichoderma reesei host cells are described in Penttila et al., 1987, Gene 61: 155-164, and Gruber et al., 1990, Curr Genet. 18(1): 71-6. Suitable methods for transforming Fusarium species are described by Malardier et al., 1989, Gene 78: 147-156 and WO 96/00787. Yeast may be transformed using the procedures described by Becker and Guarente, In Abelson, J. N. and Simon, M. I., editors, Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology 153: 163; and Hinnen et al., 1978, Proceedings of the National Academy of Sciences USA 75: 1920.

Degradation of Biomass

The present invention also relates to methods for degrading or converting a cellulose-containing and/or hemicellulose-containing biomass, comprising treating the biomass with an effective amount of one or more polypeptides obtained by the methods of the present invention, wherein the one or more polypeptides have enzyme activity against the cellulose-containing and/or hemicellulose-containing biomass. For example, the methods of the present invention may be used to produce enzymes and host cells for use in the production of ethanol from biomass. Ethanol can be produced by enzymatic degradation of biomass and conversion of the released polysaccharides to ethanol. This kind of ethanol is often referred to as bioethanol or biofuel. It can be used as a fuel additive or extender in blends ranging from less than 1% up to 100% (a fuel substitute).

The methods of the present invention may also be used to produce enzymes and host cells for use in the production of monosaccharides, disaccharides, and polysaccharides as chemical or fermentation feedstocks from biomass for the production of ethanol, plastics, or other products or intermediates. The enzymes may be in the form of a crude fermentation broth with or without the cells removed or in the form of a semi-purified or purified enzyme preparation. Alternatively, a host cell of the present invention may be used as a source of one or more enzymes in a fermentation process with the biomass.

Biomass can include, but is not limited to, wood resources, municipal solid waste, wastepaper, and crop residues (see, for example, Wiselogel et al., 1995, in Handbook on Bioethanol (Charles E. Wyman, editor), pp. 105-118, Taylor & Francis, Washington D.C.; Wyman, 1994, Bioresource Technology 50: 3-16; Lynd, 1990, Applied Biochemistry and Biotechnology 24/25: 695-719; Mosier et al., 1999, Recent Progress in Bioconversion of Lignocellulosics, in Advances in Biochemical Engineering/Biotechnology, T. Scheper, managing editor, Volume 65, pp. 23-40, Springer-Verlag, New York).

The predominant polysaccharide in the primary cell wall of biomass is cellulose, the second most abundant is hemicellulose, and the third is pectin. The secondary cell wall, produced after the cell has stopped growing, also contains polysaccharides and is strengthened through polymeric lignin covalently cross-linked to hemicellulose. Cellulose is a homopolymer of anhydrocellobiose and thus a linear beta-(1-4)-D-glucan, while hemicelluloses include a variety of compounds, such as xylans, xyloglucans, arabinoxylans, and mannans in complex branched structures with a spectrum of substituents. Although generally polymorphous, cellulose is found in plant tissue primarily as an insoluble crystalline matrix of parallel glucan chains. Hemicelluloses usually hydrogen bond to cellulose, as well as to other hemicelluloses, which helps stabilize the cell wall matrix.

Three major classes of glycohydrolases are used to break down cellulosic biomass:

(1) The “endo-1,4-beta-glucanases” or 1,4-beta-D-glucan-4-glucanohydrolases (EC 3.2.1.4), which act randomly on soluble and insoluble 1,4-beta-glucan substrates.

(2) The “exo-1,4-beta-D-glucanases” including both the 1,4-beta-D-glucan glucohydrolases (EC 3.2.1.74), which liberate D-glucose from 1,4-beta-D-glucans and hydrolyze D-cellobiose slowly, and cellobiohydrolases (1,4-beta-D-glucan cellobiohydrolases, EC 3.2.1.91), which liberate D-cellobiose from 1,4-beta-glucans.

(3) The “beta-D-glucosidases” or beta-D-glucoside glucohydrolases (EC 3.2.1.21), which act to release D-glucose units from cellobiose and soluble cellodextrins, as well as an array of glycosides.

These three classes of enzymes work together synergistically, resulting in efficient decrystallization and hydrolysis of native cellulose from biomass to yield reducing sugars.

The methods of the present invention may also be used to produce other enzymes in conjunction with the above-noted enzymes to further degrade the hemicellulose component of the biomass substrate (see, for example, Brigham et al., 1995, in Handbook on Bioethanol (Charles E. Wyman, editor), pp. 119-141, Taylor & Francis, Washington D.C.; Lee, 1997, Journal of Biotechnology 56: 1-24). Such enzymes include, but are not limited to, enzymes that degrade beta-1,3-1,4-glucan such as endo-beta-1,3(4)-glucanase, endoglucanase (beta-glucanase, cellulase), and beta-glucosidase; degrade xyloglucans such as xyloglucanase, endoglucanase, and cellulase; degrade xylan such as xylanase, xylosidase, alpha-arabinofuranosidase, alpha-glucuronidase, and acetyl xylan esterase; degrade mannan such as mannanase, mannosidase, alpha-galactosidase, and mannan acetyl esterase; degrade galactan such as galactanase; degrade arabinan such as arabinanase; degrade homogalacturonan such as pectate lyase, pectin lyase, polygalacturonase, pectin acetyl esterase, and pectin methyl esterase; degrade rhamnogalacturonan such as alpha-arabinofuranosidase, beta-galactosidase, galactanase, arabinanase, rhamnogalacturonase, rhamnogalacturonan lyase, and rhamnogalacturonan acetyl esterase; degrade xylogalacturonan such as xylogalacturonosidase, xylogalacturonase, and rhamnogalacturonan lyase; and degrade lignin such as lignin peroxidases, manganese-dependent peroxidases, hybrid peroxidases with combined properties of lignin peroxidases and manganese-dependent peroxidases, and laccases. Other enzymes include esterases, lipases, oxidases, phospholipases, phytases, proteases, and peroxidases.

The present invention is further described by the following examples which should not be construed as limiting the scope of the invention.

EXAMPLES

Strains

Trichoderma reesei RutC30 (ATCC 56765; Montenecourt and Eveleigh, 1979, Adv. Chem. Ser. 181: 289-301) was derived from Trichoderma reesei QM6a (ATCC 13631; Mandels and Reese, 1957, J. Bacteriol. 73: 269-278). Trichoderma reesei RutC30 and Saccharomyces cerevisiae YNG318 (MATα, ura3-52, leu-2Δ2, pep4A1, his4-539) (WO 97/07205) were used as hosts for expression of Aspergillus oryzae beta-glucosidase. Aspergillus fumigatus PaHa34 was used as the source of the Family GH3A beta-glucosidase.

Media and Buffer Solutions

YP medium was composed per liter of 10 g of yeast extract and 20 g of bactopeptone.

Yeast selection medium was composed per liter of 6.7 g of yeast nitrogen base, 0.8 g of complete supplement mixture (CSM, Qbiogene, Inc., Carlsbad, Calif.; missing uracil and containing 40 mg/ml of adenine), 5 g of casamino acids (without amino acids), 100 ml of 0.5 M succinate pH 5.0, 40 ml of 50% glucose, 1 ml of 100 mM CuSO₄, 50 mg of ampicillin, and 25 mg of chloramphenicol.

Yeast selection plate medium was composed per liter of yeast selection medium supplemented with 20 g of bacto agar and 150 mg of 5-bromo-4-chloro-3-indolyl-beta-D-glucopyranoside (X-Glc, INALCO SPA, Milano, Italy) but lacking both ampicillin and chloramphenicol.

COVE selection plates were composed per liter of 342.3 g of sucrose, 20 ml of COVE salt solution, 10 mM acetamide, 15 mM CsCl, and 25 g of Noble agar.

COVE2 plates were composed per liter of 30 g of sucrose, 20 ml of COVE salt solution, 10 mM acetamide, and 25 g of Noble agar.

COVE salt solution was composed per liter of 26 g of KCl, 26 g of MgSO₄.7H₂O, 76 g of KH₂PO₄, and 50 ml of COVE trace metals solution.

COVE trace metals solution was composed per liter of 0.04 g of Na₂B₄O₇.10H₂O, 0.4 g of CuSO₄.5H₂O, 1.2 g of FeSO₄.7H₂O, 0.7 g of MnSO₄.H₂O, 0.8 g of Na₂MoO₄.2H₂O, and 10 g of ZnSO₄.7H₂O.

Cellulase-inducing medium was composed per liter of 20 g of Arbocel B800-natural cellulose fibers (J. Rettenmaier USA LP, Schoolcraft, Mich.), 10 g of corn steep solids (Sigma Chemical Co., St. Louis, Mo.), 1.45 g of (NH₄)₂SO₄, 2.08 g of KH₂PO₄, 0.28 g of CaCl₂, 0.42 g of MgSO₄.7H₂O, 0.42 ml of Trichoderma reesei trace metals solution, and 2 drops of pluronic acid; the pH was adjusted to 6.0 with 10 N NaOH.

Trichoderma reesei trace metals solution was composed per liter of 216 g of FeCl₃.6H₂O, 58 g of ZnSO₄.7H₂O, 27 g of MnSO₄.H₂O, 10 g of CuSO₄.5H₂O, 2.4 g of H₃BO₃, and 336 g of citric acid.

PEG Buffer was composed per liter of 500 g of PEG 4000 (BDH, Poole, England), 10 mM CaCl₂, and 10 mM Tris-HCl pH 7.5 (filter sterilized).

STC was composed per liter of 1 M sorbitol, 10 mM CaCl₂, and 10 mM Tris-HCl pH 7.5 (filter sterilized).

Inoculum Medium was composed per liter of 20 g of glucose, 10 g of corn steep solids (Sigma Chemical Co., St. Louis, Mo.), 1.45 g of (NH₄)₂SO₄, 2.08 g of KH₂PO₄, 0.28 g of CaCl₂, 0.42 g of MgSO₄.7H₂O, 0.42 ml of Trichoderma reesei trace metals solution, and 2 drops of pluronic acid; final pH 5.0.

Fermentation Medium was composed per liter of 4 g of glucose, 10 g of corn steep solids, 30 g of Arbocel B800-natural cellulose fibers (J. Rettenmaier USA LP, Schoolcraft, Mich.), 3.8 g of (NH₄)₂SO₄, 2.8 g of KH₂PO₄, 2.08 g of CaCl₂, 1.63 g of MgSO₄.7H₂O, 0.75 ml of Trichoderma reesei trace metals solution, and 1.8 ml of pluronic acid.

Feed Medium was composed per liter of 600 g of glucose, 20 g of Cellulose B800, 35.5 g of H₃PO₄, and 5 ml of pluronic acid.
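Because every recipe above is stated per liter, preparing a different batch volume is a matter of linear scaling. The following is a minimal Python sketch of that arithmetic, using the COVE salt solution amounts as listed; the function name, the dictionary layout, and the 0.5 liter batch size are illustrative assumptions and not part of the described procedure.

```python
# Per-liter composition of COVE salt solution as listed above: (amount, unit).
COVE_SALT_PER_LITER = {
    "KCl": (26.0, "g"),
    "MgSO4.7H2O": (26.0, "g"),
    "KH2PO4": (76.0, "g"),
    "COVE trace metals": (50.0, "ml"),
}


def scale_recipe(per_liter, volume_liters):
    """Scale each per-liter amount linearly to the requested batch volume."""
    if volume_liters <= 0:
        raise ValueError("batch volume must be positive")
    return {
        name: (amount * volume_liters, unit)
        for name, (amount, unit) in per_liter.items()
    }


if __name__ == "__main__":
    # Hypothetical 0.5 liter batch, chosen only for illustration.
    for name, (amount, unit) in scale_recipe(COVE_SALT_PER_LITER, 0.5).items():
        print(f"{name}: {amount:g} {unit}")
```

Components specified as concentrations (for example 10 mM acetamide) are unaffected by batch volume and are deliberately left out of this sketch.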

Beta-glucosidase Activity Assay

For Trichoderma reesei samples, beta-glucosidase activity was determined at ambient temperature using 25 μl aliquots of culture supernatants, diluted 1:10 in 50 mM succinate pH 5.0, using 200 μl of 0.5 mg/ml p-nitrophenyl-beta-D-glucopyranoside as substrate in 50 mM succinate pH 5.0. After 15 minutes incubation the reaction was stopped by adding 100 μl of 1 M Tris-HCl pH 8.0 and the absorbance was read spectrophotometrically at 405 nm.

For Saccharomyces cerevisiae samples, culture supernatant samples were diluted 0.6-fold with 0.1 M succinate pH 5.0 in 96-well microtiter plates. Twenty-five μl of the diluted samples were taken from each well and added to a new 96-well plate containing 200 μl of 1 mg/ml p-nitrophenyl-beta-D-glucopyranoside substrate. The plates were incubated at ambient temperature for 1.5 hours and the reaction was stopped by adding 2 M Tris-HCl pH 9. The plates were then read spectrophotometrically at 405 nm.

One unit of beta-glucosidase activity corresponded to production of 1 μmol of p-nitrophenol per minute per liter at pH 5.0, ambient temperature. Aspergillus niger beta-glucosidase (Novozyme 188, Novozymes A/S, Bagsværd, Denmark) was used as an enzyme standard.
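As a worked illustration of the unit definition above, the following Python sketch converts an amount of product released in one well into activity per liter of undiluted supernatant. It reflects one reading of the definition (μmol of product per minute per liter of sample) and assumes that "1:10" denotes a ten-fold dilution; the 0.015 μmol product value is a made-up placeholder, since in practice the amount would be derived from the enzyme standard, which is not modeled here.

```python
def activity_per_liter(umol_product, minutes, aliquot_ul, dilution_factor):
    """Return umol of product per minute per liter of undiluted supernatant.

    umol_product    -- product released in the well (from a standard curve)
    minutes         -- incubation time before the reaction was stopped
    aliquot_ul      -- volume of diluted sample added to the well, in microliters
    dilution_factor -- fold dilution applied to the supernatant before assay
    """
    if minutes <= 0 or aliquot_ul <= 0 or dilution_factor <= 0:
        raise ValueError("time, volume, and dilution must be positive")
    rate_umol_per_min = umol_product / minutes
    aliquot_liters = aliquot_ul * 1e-6
    # Normalize to one liter of diluted sample, then undo the dilution.
    return rate_umol_per_min / aliquot_liters * dilution_factor


if __name__ == "__main__":
    # Trichoderma reesei assay settings from the text: 25 ul aliquot,
    # ten-fold dilution, 15 minute incubation. Product amount is hypothetical.
    print(activity_per_liter(0.015, 15, 25, 10))
```

The same function covers the Saccharomyces cerevisiae format by substituting its incubation time and dilution.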

DNA Sequencing

DNA sequencing was performed on an ABI3700 (Applied Biosystems, Foster City, Calif.) using dye terminator chemistry (Giesecke et al., 1992, Journal of Virol. Methods 38: 47-60). Sequences were assembled using phred/phrap/consed (University of Washington, Seattle, Wash.) with sequence specific primers.

Example 1 Construction of pAILo1 Expression Vector

Expression vector pAILo1 was constructed by modifying pBANe6 (U.S. Pat. No. 6,461,837), which comprises the NA2-tpi promoter, Aspergillus niger amyloglucosidase terminator sequence (AMG terminator), and Aspergillus nidulans acetamidase gene (amdS). Modification of pBANe6 was performed by first eliminating three Nco I restriction sites at positions 2051, 2722, and 3397 bp from the amdS selection marker by site-directed mutagenesis. All changes were designed to be “silent”, leaving the actual protein sequence of the amdS gene product unchanged. Removal of these three sites was performed simultaneously with a GeneEditor Site-Directed Mutagenesis Kit (Promega, Madison, Wis.) according to the manufacturer's instructions using the following primers (underlined nucleotide represents the changed base):

AMDS3NcoMut (2050): 5′-GTGCCCCATGATACGCCTCCGG-3′ (SEQ ID NO: 1)

AMDS2NcoMut (2721): 5′-GAGTCGTATTTCCAAGGCTCCTGACC-3′ (SEQ ID NO: 2)

AMDS1NcoMut (3396): 5′-GGAGGCCATGAAGTGGACCAACGG-3′ (SEQ ID NO: 3)

A plasmid comprising all three expected sequence changes was then submitted to site-directed mutagenesis, using a QuickChange Mutagenesis Kit (Stratagene, La Jolla, Calif.), to eliminate the Nco I restriction site at the end of the AMG terminator at position 1643. The following primers (underlined nucleotide represents the changed base) were used for mutagenesis:

-   Upper Primer to mutagenize the Aspergillus niger amyloglucosidase (AMG) terminator sequence:

(SEQ ID NO: 4) 5′-CACCGTGAAAGCCATGCTCTTTCCTTCGTGTAGAAGACCAGACAG-3′

-   Lower Primer to mutagenize the Aspergillus niger amyloglucosidase (AMG) terminator sequence:

(SEQ ID NO: 5) 5′-CTGGTCTTCTACACGAAGGAAAGAGCATGGCTTTCACGGTGTCTG-3′

The last step in the modification of pBANe6 was the addition of a new Nco I restriction site at the beginning of the polylinker using a QuickChange Mutagenesis Kit and the following primers (underlined nucleotides represent the changed bases) to yield pAILo1 (FIG. 1).

-   Upper Primer to mutagenize the Aspergillus niger amylase promoter (NA2-tpi):

(SEQ ID NO: 6) 5′-CTATATACACAACTGGATTTACCATGGGCCCGCGGCCGCAGATC-3′

-   Lower Primer to mutagenize the Aspergillus niger amylase promoter (NA2-tpi):

(SEQ ID NO: 7) 5′-GATCTGCGGCCGCGGGCCCATGGTAAATCCAGTTGTGTATATAG-3′
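The intent of the mutagenic primers listed in this example can be checked directly on their sequences: the primers used to eliminate Nco I sites should not contain the Nco I recognition sequence, whereas the primers used to add the new site should. The following Python sketch performs that check and also tests whether the upper and lower polylinker primers are exact reverse complements of each other. The helper names are illustrative assumptions; the primer sequences are copied from SEQ ID NOs: 1 to 7 above.

```python
NCO_I = "CCATGG"  # Nco I recognition sequence (palindromic)

PRIMERS = {
    1: "GTGCCCCATGATACGCCTCCGG",
    2: "GAGTCGTATTTCCAAGGCTCCTGACC",
    3: "GGAGGCCATGAAGTGGACCAACGG",
    4: "CACCGTGAAAGCCATGCTCTTTCCTTCGTGTAGAAGACCAGACAG",
    5: "CTGGTCTTCTACACGAAGGAAAGAGCATGGCTTTCACGGTGTCTG",
    6: "CTATATACACAACTGGATTTACCATGGGCCCGCGGCCGCAGATC",
    7: "GATCTGCGGCCGCGGGCCCATGGTAAATCCAGTTGTGTATATAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_site(seq, site=NCO_I):
    """True if the site occurs on either strand of the sequence."""
    return site in seq or site in reverse_complement(seq)


if __name__ == "__main__":
    for seq_id, seq in PRIMERS.items():
        status = "contains" if has_site(seq) else "lacks"
        print(f"SEQ ID NO: {seq_id} {status} an Nco I site")
    print("SEQ ID NO: 7 is the reverse complement of SEQ ID NO: 6:",
          reverse_complement(PRIMERS[6]) == PRIMERS[7])
```

A check of this kind only confirms the primer design; it says nothing about whether the mutagenesis succeeded in the plasmid, which the text establishes by other means.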

The amdS gene of pAILo1 was swapped with the Aspergillus nidulans pyrG gene. Plasmid pBANe10 (FIG. 14) was used as a source for the pyrG gene as a selection marker. Analysis of the sequence of pBANe10 showed that the pyrG marker was contained within an Nsi I restriction fragment and did not contain either Nco I or Pac I restriction sites. Since the amdS gene is also flanked by Nsi I restriction sites, the strategy to switch the selection marker was a simple swap of Nsi I restriction fragments. Plasmid DNA from pAILo1 and pBANe10 was digested with the restriction enzyme Nsi I and the products were purified by agarose gel electrophoresis. The Nsi I fragment from pBANe10 containing the pyrG gene was ligated to the backbone of pAILo1 to replace the original Nsi I DNA fragment containing the amdS gene. Recombinant clones were analyzed by restriction digest to determine that they had the correct insert and also its orientation. A clone with the pyrG gene transcribed in the counterclockwise direction was selected. The new plasmid was designated pAILo2 (FIG. 15).

Example 2 Construction of pMJ04 Expression Vector

Expression vector pMJ04 was constructed by PCR amplifying the Trichoderma reesei exocellobiohydrolase 1 gene (cbh1) terminator from Trichoderma reesei RutC30 genomic DNA using primers 993429 (antisense) and 993428 (sense) shown below. The antisense primer was engineered to have a Pac I site at its 5′-end and the sense primer was engineered to have a Spe I site at its 5′-end.

Primer 993429 (antisense): 5′-AACGTTAATTAAGGAATCGTTTTGTGTTT-3′ (SEQ ID NO: 8)

Primer 993428 (sense): 5′-AGTACTAGTAGCTCCGTGGCGAAAGCCTG-3′ (SEQ ID NO: 9)
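Where the engineered restriction sites sit within the two primers above can be read off the sequences themselves. The following Python sketch reports the zero-based offset of the Pac I and Spe I recognition sequences from the 5′ end of each primer. The recognition sequences (TTAATTAA for Pac I, ACTAGT for Spe I) are the standard ones for these enzymes; the function and constant names are illustrative assumptions.

```python
PAC_I = "TTAATTAA"  # Pac I recognition sequence
SPE_I = "ACTAGT"    # Spe I recognition sequence

PRIMER_993429 = "AACGTTAATTAAGGAATCGTTTTGTGTTT"  # antisense, SEQ ID NO: 8
PRIMER_993428 = "AGTACTAGTAGCTCCGTGGCGAAAGCCTG"  # sense, SEQ ID NO: 9


def site_offset(primer, site):
    """Return the zero-based offset of the site from the 5' end, or -1 if absent."""
    return primer.find(site)


if __name__ == "__main__":
    for name, primer, enzyme, site in (
        ("993429", PRIMER_993429, "Pac I", PAC_I),
        ("993428", PRIMER_993428, "Spe I", SPE_I),
    ):
        offset = site_offset(primer, site)
        print(f"Primer {name}: {enzyme} site starts {offset} nt from the 5' end "
              f"of a {len(primer)} nt primer")
```

In both primers the recognition sequence lies within the first dozen nucleotides, preceded by a few extra 5′ bases.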

Trichoderma reesei RutC30 genomic DNA was isolated using a DNeasy Plant Maxi Kit (Qiagen, Chatsworth, Calif.).

The amplification reactions (50 μl) were composed of 1× ThermoPol Reaction Buffer (New England Biolabs, Beverly, Mass.), 0.3 mM dNTPs, 100 ng of Trichoderma reesei RutC30 genomic DNA, 0.3 μM primer 993429, 0.3 μM primer 993428, and 2 units of Vent polymerase (New England Biolabs, Beverly, Mass.). The reactions were incubated in an Eppendorf Mastercycler 5333 (Eppendorf Scientific, Inc., Westbury, N.Y.) programmed as follows: 5 cycles each for 30 seconds at 94° C., 30 seconds at 50° C., and 60 seconds at 72° C., followed by 25 cycles each for 30 seconds at 94° C., 30 seconds at 65° C., and 120 seconds at 72° C. (5 minute final extension). The reaction products were isolated on a 1.0% agarose gel using 40 mM Tris base-20 mM sodium acetate-1 mM disodium EDTA (TAE) buffer where a 229 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit (QIAGEN, Chatsworth, Calif.) according to the manufacturer's instructions.
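The two-stage cycling program described above can be written down as data, which makes the total programmed hold time easy to compute. The following Python sketch does so; the data layout and function name are illustrative assumptions, and the result excludes temperature ramping and any initial denaturation step, neither of which is specified in the text.

```python
# Each stage: (number of cycles, [(seconds, temperature in deg C), ...]).
PROGRAM = [
    (5, [(30, 94), (30, 50), (60, 72)]),
    (25, [(30, 94), (30, 65), (120, 72)]),
]
FINAL_EXTENSION_SECONDS = 5 * 60


def total_hold_seconds(program, final_extension_seconds):
    """Sum the programmed hold times over all cycles plus the final extension."""
    cycling = sum(
        cycles * sum(seconds for seconds, _temp in steps)
        for cycles, steps in program
    )
    return cycling + final_extension_seconds


if __name__ == "__main__":
    total = total_hold_seconds(PROGRAM, FINAL_EXTENSION_SECONDS)
    print(f"Programmed hold time: {total} s ({total / 60:g} min)")
```

Under these assumptions the program holds for 5 × 120 s plus 25 × 180 s plus a 300 s final extension, i.e. 5400 s or 90 minutes.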

The resulting PCR fragment was digested with Pac I and Spe I and ligated into pAILo1 digested with the same restriction enzymes using a Rapid Ligation Kit (Roche, Indianapolis, Ind.), to generate pMJ04 (FIG. 2).
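As a check on the primer design, the two cloning sites can be located directly in the primer sequences given above. The following Python sketch is illustrative only; the helper name find_sites is hypothetical, and the recognition sequences used (Pac I, TTAATTAA; Spe I, ACTAGT) are the standard ones for these enzymes rather than values stated in this example.

```python
# Standard recognition sequences (assumed; not stated in the example itself).
SITES = {"Pac I": "TTAATTAA", "Spe I": "ACTAGT"}

# Primer sequences as given above, written 5' to 3'.
PRIMERS = {
    "993429": "AACGTTAATTAAGGAATCGTTTTGTGTTT",
    "993428": "AGTACTAGTAGCTCCGTGGCGAAAGCCTG",
}


def find_sites(seq, sites):
    """Return {enzyme: 0-based index of first occurrence} for sites present in seq."""
    return {name: seq.find(site) for name, site in sites.items() if site in seq}


for primer_name, primer_seq in PRIMERS.items():
    print(primer_name, find_sites(primer_seq, SITES))
```

Both sites lie within the first few nucleotides counted from the 5′ end of the respective primer, leaving a short 5′ extension ahead of each site.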

Example 3 Construction of pCaHj568 Expression Vector

Expression plasmid pCaHj568 was constructed from pCaHj170 (U.S. Pat. No. 5,763,254) and pMT2188. Plasmid pCaHj170 comprises the Humicola insolens endoglucanase V (EGV) coding region. Plasmid pMT2188 was constructed as follows: The pUC19 origin of replication was PCR amplified from pCaHj483 (WO 98/00529) with primers 142779 and 142780 shown below. Primer 142780 introduces a Bbu I site in the PCR fragment.

142779: 5′-TTGAATTGAAAATAGATTGATTTAAAACTTC-3′ (SEQ ID NO: 10)
142780: 5′-TTGCATGCGTAATCATGGTCATAGC-3′ (SEQ ID NO: 11)

The Expand PCR system (Roche Molecular Biochemicals, Basel, Switzerland) was used for the amplification following the manufacturer's instructions for this and the subsequent PCR amplifications. PCR products were separated on a 1% agarose gel using TAE buffer and an 1160 bp fragment was isolated and purified using a Jetquick Gel Extraction Spin Kit (Genomed, Wielandstr, Germany).

The URA3 gene was amplified from the Saccharomyces cerevisiae cloning vector pYES2 (Invitrogen, Carlsbad, Calif.) using primers 140288 and 142778 below. Primer 140288 introduces an Eco RI site in the PCR fragment.

140288: 5′-TTGAATTCATGGGTAATAACTGATAT-3′ (SEQ ID NO: 12)
142778: 5′-AAATCAATCTATTTTCAATTCAATTCATCATT-3′ (SEQ ID NO: 13)

PCR products were separated on a 1% agarose gel using TAE buffer and an 1126 bp fragment was isolated and purified using a Jetquick Gel Extraction Spin Kit.

The two PCR fragments were fused by mixing and amplification with primers 142780 and 140288 shown above using the overlap splicing method (Horton et al., 1989, Gene 77: 61-68). PCR products were separated on a 1% agarose gel using TAE buffer and a 2263 bp fragment was isolated and purified using a Jetquick Gel Extraction Spin Kit.
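The fragment sizes reported above are consistent with the complementarity built into the two inner primers, 142779 and 142778. The following Python sketch, offered as an illustration rather than as part of the original procedure, computes the overlap shared by the two fragments from the primer sequences and the expected size of the fused product. The helper names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


P142779 = "TTGAATTGAAAATAGATTGATTTAAAACTTC"   # inner primer, origin fragment
P142778 = "AAATCAATCTATTTTCAATTCAATTCATCATT"  # inner primer, URA3 fragment

# The URA3 fragment ends with the reverse complement of primer 142778,
# and the origin fragment begins with primer 142779.
overlap_bp = longest_overlap(reverse_complement(P142778), P142779)
fused_bp = 1160 + 1126 - overlap_bp
print(overlap_bp, fused_bp)
```

The primers share 23 nucleotides, so the 1160 bp and 1126 bp fragments are expected to yield a fused product of 2263 bp, in agreement with the size of the isolated fragment.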

The resulting fragment was digested with Eco RI and Bbu I and ligated to the largest fragment of pCaHj483 digested with the same enzymes. The ligation mixture was used to transform pyrF-negative E. coli strain DB6507 (ATCC 35673) made competent by the method of Mandel and Higa, 1970, J. Mol. Biol. 45: 154. Transformants were selected on solid M9 medium (Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press) supplemented per liter with 1 g of casaminoacids, 500 μg of thiamine, and 10 mg of kanamycin. A plasmid from one transformant was isolated and designated pCaHj527 (FIG. 3).

The NA2/tpi promoter present on pCaHj527 was subjected to site directed mutagenesis by a simple PCR approach. Nucleotides 134-144 were converted from GTACTAAAACC to CCGTTAAATTT using mutagenic primer 141223:

Primer 141223: 5′-GGATGCTGTTGACTCCGGAAATTTAACGGTTTGGTCTTGCATCCC-3′ (SEQ ID NO: 14)

Nucleotides 423-436 were converted from ATGCAATTTAAACT to CGGCAATTTAACGG using mutagenic primer 141222:

Primer 141222: 5′-GGTATTGTCCTGCAGACGGCAATTTAACGGCTTCTGCGAATCGC-3′ (SEQ ID NO: 15)

The resulting plasmid was designated pMT2188 (FIG. 4).
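The two mutagenic primers can be checked against the stated sequence conversions. The following Python sketch is illustrative only: it confirms that each replacement sequence is carried by the corresponding primer on one strand or the other and counts the nucleotide positions changed by each conversion. The helper names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def carried_on_either_strand(primer, target):
    """True if the primer contains the target or its reverse complement."""
    return target in primer or reverse_complement(target) in primer


def changed_positions(old, new):
    """Number of positions at which two equal-length sequences differ."""
    if len(old) != len(new):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(old, new))


P141223 = "GGATGCTGTTGACTCCGGAAATTTAACGGTTTGGTCTTGCATCCC"
P141222 = "GGTATTGTCCTGCAGACGGCAATTTAACGGCTTCTGCGAATCGC"

print(carried_on_either_strand(P141223, "CCGTTAAATTT"),
      changed_positions("GTACTAAAACC", "CCGTTAAATTT"))
print(carried_on_either_strand(P141222, "CGGCAATTTAACGG"),
      changed_positions("ATGCAATTTAAACT", "CGGCAATTTAACGG"))
```

Primer 141223 carries the reverse complement of the first replacement sequence, whereas primer 141222 carries the second replacement sequence directly; the conversions change 7 of 11 and 5 of 14 positions, respectively.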

The Humicola insolens endoglucanase V coding region was transferred from pCaHj170 as a Bam HI-Sal I fragment into pMT2188 digested with Bam HI and Xho I to generate pCaHj568 (FIG. 5).

Example 4 Construction of pMJ05 Expression Vector

Expression vector pMJ05 was constructed by PCR amplifying the 915 bp Humicola insolens endoglucanase V coding region from pCaHj568 using primers HiEGV-F and HiEGV-R shown below.

HiEGV-F (sense): 5′-AAGCTTAAGCATGCGTTCCTCCCCCCTCC-3′ (SEQ ID NO: 16)
HiEGV-R (antisense): 5′-CTGCAGAATTCTACAGGCACTGATGGTACCAG-3′ (SEQ ID NO: 17)

The amplification reactions (50 μl) were composed of 1× ThermoPol Reaction Buffer, 0.3 mM dNTPs, 10 ng/μl pCaHj568 plasmid, 0.3 μM HiEGV-F primer, 0.3 μM HiEGV-R primer, and 2 units of Vent polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 5 cycles each for 30 seconds at 94° C., 30 seconds at 50° C., and 60 seconds at 72° C., followed by 25 cycles each for 30 seconds at 94° C., 30 seconds at 65° C., and 120 seconds at 72° C. (5 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 937 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

This 937 bp purified fragment was used as template DNA for subsequent amplifications using the following primers:

HiEGV-R (antisense): 5′-CTGCAGAATTCTACAGGCACTGATGGTACCAG-3′ (SEQ ID NO: 18)
HiEGV-F-overlap (sense): 5′-ACCGCGGACTGCGCATCATGCGTTCCTCCCCCCTCC-3′ (SEQ ID NO: 19)

Primer sequences in italics are homologous to 17 bp of the Trichoderma reesei cbh1 promoter and underlined primer sequences are homologous to 19 bp of the Humicola insolens endoglucanase V coding region. The 36 bp overlap between the promoter and the coding sequence allowed precise fusion of the 994 bp fragment comprising the Trichoderma reesei cbh1 promoter to the 918 bp fragment comprising the Humicola insolens endoglucanase V open reading frame.

The amplification reactions (50 μl) were composed of 1× ThermoPol Reaction Buffer, 0.3 mM dNTPs, 1 μl of 937 bp purified PCR fragment, 0.3 μM HiEGV-F-overlap primer, 0.3 μM HiEGV-R primer, and 2 units of Vent polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 5 cycles each for 30 seconds at 94° C., 30 seconds at 50° C., and 60 seconds at 72° C., followed by 25 cycles each for 30 seconds at 94° C., 30 seconds at 65° C., and 120 seconds at 72° C. (5 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 945 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

A separate PCR was performed to amplify the Trichoderma reesei cbh1 promoter sequence extending from 994 bp upstream of the ATG start codon of the gene from Trichoderma reesei RutC30 genomic DNA using the following primers (the sense primer was engineered to have a Sal I restriction site at the 5′-end):

TrCBHIpro-F (sense): 5′-AAACGTCGACCGAATGTAGGATTGTTATC-3′ (SEQ ID NO: 20)
TrCBHIpro-R (antisense): 5′-GATGCGCAGTCCGCGGT-3′ (SEQ ID NO: 21)

The amplification reactions (50 μl) were composed of 1× ThermoPol Reaction Buffer, 0.3 mM dNTPs, 100 ng of Trichoderma reesei RutC30 genomic DNA, 0.3 μM TrCBHIpro-F primer, 0.3 μM TrCBHIpro-R primer, and 2 units of Vent polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 30 seconds at 94° C., 30 seconds at 55° C., and 120 seconds at 72° C. (5 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 998 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The 998 bp purified PCR fragment was used as template DNA for subsequent amplifications using the following primers:

TrCBHIpro-F: 5′-AAACGTCGACCGAATGTAGGATTGTTATC-3′ (SEQ ID NO: 22)
TrCBHIpro-R-overlap: 5′-GGAGGGGGGAGGAACGCATGATGCGCAGTCCGCGGT-3′ (SEQ ID NO: 23)

Sequences in italics are homologous to 17 bp of the Trichoderma reesei cbh1 promoter and underlined sequences are homologous to 19 bp of the Humicola insolens endoglucanase V coding region. The 36 bp overlap between the promoter and the coding sequence allowed precise fusion of the 994 bp fragment comprising the Trichoderma reesei cbh1 promoter to the 918 bp fragment comprising the Humicola insolens endoglucanase V open reading frame.

The amplification reactions (50 μl) were composed of 1× ThermoPol Reaction Buffer, 0.3 mM dNTPs, 1 μl of 998 bp purified PCR fragment, 0.3 μM TrCBHIpro-F primer, 0.3 μM TrCBHIpro-R-overlap primer, and 2 units of Vent polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 5 cycles each for 30 seconds at 94° C., 30 seconds at 50° C., and 60 seconds at 72° C., followed by 25 cycles each for 30 seconds at 94° C., 30 seconds at 65° C., and 120 seconds at 72° C. (5 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 1017 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The 1017 bp Trichoderma reesei cbh1 promoter PCR fragment and the 945 bp Humicola insolens endoglucanase V PCR fragment were used as template DNA for subsequent amplification using the following primers to precisely fuse the 994 bp Trichoderma reesei cbh1 promoter to the 918 bp Humicola insolens endoglucanase V coding region using overlapping PCR.

TrCBHIpro-F: 5′-AAACGTCGACCGAATGTAGGATTGTTATC-3′ (SEQ ID NO: 24)
HiEGV-R: 5′-CTGCAGAATTCTACAGGCACTGATGGTACCAG-3′ (SEQ ID NO: 25)

The amplification reactions (50 μl) were composed of 1× ThermoPol Reaction Buffer, 0.3 mM dNTPs, 0.3 μM TrCBHIpro-F primer, 0.3 μM HiEGV-R primer, and 2 units of Vent polymerase.

The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 5 cycles each for 30 seconds at 94° C., 30 seconds at 50° C., and 60 seconds at 72° C., followed by 25 cycles each for 30 seconds at 94° C., 30 seconds at 65° C., and 120 seconds at 72° C. (5 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 1926 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.
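The precise fusion relies on the two overlap primers being exact reverse complements of one another, so that the promoter fragment and the coding fragment share an identical 36 bp junction. The following Python sketch, provided for illustration only, verifies this from the primer sequences given above and computes the expected size of the fused product. The helper name reverse_complement and the constant names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


HIEGV_F_OVERLAP = "ACCGCGGACTGCGCATCATGCGTTCCTCCCCCCTCC"
TRCBHIPRO_R_OVERLAP = "GGAGGGGGGAGGAACGCATGATGCGCAGTCCGCGGT"
TRCBHIPRO_R = "GATGCGCAGTCCGCGGT"

# The two overlap primers anneal to each other over their full length.
junction_bp = len(HIEGV_F_OVERLAP)
primers_complementary = reverse_complement(HIEGV_F_OVERLAP) == TRCBHIPRO_R_OVERLAP

# Promoter-derived portion of the junction, taken from primer TrCBHIpro-R.
promoter_part_bp = len(TRCBHIPRO_R)
coding_part_bp = junction_bp - promoter_part_bp

# The shared junction is counted once in the fused product.
fused_bp = 1017 + 945 - junction_bp
print(primers_complementary, promoter_part_bp, coding_part_bp, fused_bp)
```

As printed, the junction consists of 17 promoter nucleotides followed by 19 coding nucleotides beginning at the ATG, and the 1017 bp and 945 bp fragments are expected to give a fused product of 1926 bp, in agreement with the excised band.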

The resulting 1926 bp fragment was cloned into pCR-Blunt-II-TOPO vector using a Zero Blunt™ TOPO PCR Cloning Kit (Invitrogen, Carlsbad, Calif.) following the manufacturer's protocol. The resulting plasmid was digested with Not I and Sal I and the 1926 bp fragment was purified and ligated into pMJ04 expression vector, which was also digested with the same two restriction enzymes, to generate pMJ05 (FIG. 6).

Example 5 Construction of pSMai130 Expression Vector

A 2586 bp DNA fragment spanning from the ATG start codon to the TAA stop codon of the Aspergillus oryzae beta-glucosidase coding sequence (SEQ ID NO: 42 for cDNA sequence and SEQ ID NO: 43 for the deduced amino acid sequence; E. coli DSM 14240) was amplified by PCR from pJaL660 (WO 2002/095014) as template with primers 993467 (sense) and 993456 (antisense) shown below. A Spe I site was engineered at the 5′ end of the antisense primer to facilitate ligation. Primer sequences in italics are homologous to 24 bp of the Trichoderma reesei cbh1 promoter and underlined sequences are homologous to 22 bp of the Aspergillus oryzae beta-glucosidase coding region.

Primer 993467: 5′-ATAGTCAACCGCGGACTGCGCATCATGAAGCTTGGTTGGATCGAGG-3′ (SEQ ID NO: 26)
Primer 993456: 5′-ACTAGTTTACTGGGCCTTAGGCAGCG-3′ (SEQ ID NO: 27)

The amplification reactions (50 μl) were composed of Pfx Amplification Buffer (Invitrogen, Carlsbad, Calif.), 0.25 mM dNTPs, 10 ng of pJaL660 plasmid, 6.4 μM primer 993467, 3.2 μM primer 993456, 1 mM MgCl₂, and 2.5 units of Pfx DNA polymerase (Invitrogen, Carlsbad, Calif.). The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 60 seconds at 94° C., 60 seconds at 55° C., and 180 seconds at 72° C. (15 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 2586 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

A separate PCR was performed to amplify the Trichoderma reesei cbh1 promoter sequence extending from 1000 bp upstream of the ATG start codon of the gene, using primer 993453 (sense) and primer 993463 (antisense) shown below to generate a 1000 bp PCR fragment. Primer sequences in italics are homologous to the 24 bp of the Trichoderma reesei cbh1 promoter and underlined primer sequences are homologous to the 22 bp of the Aspergillus oryzae beta-glucosidase coding region. The 46 bp overlap between the promoter and the coding sequence allows precise fusion of the 1000 bp fragment comprising the Trichoderma reesei cbh1 promoter to the 2586 bp fragment comprising the Aspergillus oryzae beta-glucosidase open reading frame.

Primer 993453: 5′-GTCGACTCGAAGCCCGAATGTAGGAT-3′ (SEQ ID NO: 28)
Primer 993463: 5′-CCTCGATCCAACCAAGCTTCATGATGCGCAGTCCGCGGTTGACTA-3′ (SEQ ID NO: 29)

The amplification reactions (50 μl) were composed of Pfx Amplification Buffer, 0.25 mM dNTPs, 100 ng of Trichoderma reesei RutC30 genomic DNA, 6.4 μM primer 993453, 3.2 μM primer 993463, 1 mM MgCl₂, and 2.5 units of Pfx DNA polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 60 seconds at 94° C., 60 seconds at 55° C., and 180 seconds at 72° C. (15 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 1000 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The purified fragments were used as template DNA for subsequent amplification using primer 993453 (sense) and primer 993456 (antisense) shown above to precisely fuse the 1000 bp fragment comprising the Trichoderma reesei cbh1 promoter to the 2586 bp fragment comprising the Aspergillus oryzae beta-glucosidase open reading frame by overlapping PCR.

The amplification reactions (50 μl) were composed of Pfx Amplification Buffer, 0.25 mM dNTPs, 6.4 μM primer 993453, 3.2 μM primer 993456, 1 mM MgCl₂, and 2.5 units of Pfx DNA polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 60 seconds at 94° C., 60 seconds at 60° C., and 240 seconds at 72° C. (15 minute final extension).

The resulting 3586 bp fragment was digested with Sal I and Spe I and ligated into pMJ04, digested with the same two restriction enzymes, to generate pSMai130 (FIG. 7).

Example 6 Construction of pSMai135

The Aspergillus oryzae beta-glucosidase coding region (minus the native signal sequence, see FIG. 8) from Lys-20 to the TAA stop codon was PCR amplified from pJaL660 as template with primer 993728 (sense) and primer 993727 (antisense) shown below. Sequences in italics are homologous to 20 bp of the Humicola insolens endoglucanase V signal sequence and sequences underlined are homologous to 22 bp of the Aspergillus oryzae beta-glucosidase coding region. A Spe I site was engineered into the 5′ end of the antisense primer.

Primer 993728: 5′-TGCCGGTGTTGGCCCTTGCCAAGGATGATCTCGCGTACTCCC-3′ (SEQ ID NO: 30)
Primer 993727: 5′-GACTAGTCTTACTGGGCCTTAGGCAGCG-3′ (SEQ ID NO: 31)

The amplification reactions (50 μl) were composed of Pfx Amplification Buffer, 0.25 mM dNTPs, 10 ng/μl pJaL660, 6.4 μM primer 993728, 3.2 μM primer 993727, 1 mM MgCl₂, and 2.5 units of Pfx DNA polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 60 seconds at 94° C., 60 seconds at 55° C., and 180 seconds at 72° C. (15 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 2523 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

A separate PCR amplification was performed to amplify 1000 bp of the Trichoderma reesei cbh1 promoter and 63 bp of the putative Humicola insolens endoglucanase V signal sequence (ATG start codon to Ala-21, FIG. 9, SEQ ID NO: 36), using primer 993724 (sense) and primer 993729 (antisense) shown below. Primer sequences in italics are homologous to 20 bp of the Humicola insolens endoglucanase V signal sequence and underlined primer sequences are homologous to the 22 bp of the Aspergillus oryzae beta-glucosidase coding region. Plasmid pMJ05, which comprises the Humicola insolens endoglucanase V coding region under the control of the cbh1 promoter, was used as a template to generate a 1063 bp fragment comprising the Trichoderma reesei cbh1 promoter/Humicola insolens endoglucanase V signal sequence fragment. A 42 bp overlap was shared between the Trichoderma reesei cbh1 promoter/Humicola insolens endoglucanase V signal sequence and the Aspergillus oryzae coding sequence to provide a perfect linkage between the promoter and the ATG start codon of the 2523 bp Aspergillus oryzae beta-glucosidase.

Primer 993724: 5′-ACGCGTCGACCGAATGTAGGATTGTTATCC-3′ (SEQ ID NO: 32)
Primer 993729: 5′-GGGAGTACGCGAGATCATCCTTGGCAAGGGCCAACACCGGCA-3′ (SEQ ID NO: 33)
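The reading frame across the signal sequence/mature coding region junction can be checked by translating the 42 bp junction carried by sense primer 993728. The following Python sketch is illustrative only; it uses the standard genetic code and assumes, as stated above, that the signal sequence is 63 bp long (ATG to Ala-21), so that the 20 signal-sequence nucleotides in the primer begin at nucleotide 44 of the signal sequence. The function and constant names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons of an upper-case DNA sequence; ignore leftovers."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


PRIMER_993728 = "TGCCGGTGTTGGCCCTTGCCAAGGATGATCTCGCGTACTCCC"
SIGNAL_BP = 63          # ATG start codon to Ala-21
SIGNAL_BP_IN_PRIMER = 20

# Nucleotides of the signal sequence that precede the primer, and the number
# of primer nucleotides to skip to reach the next codon boundary.
preceding = SIGNAL_BP - SIGNAL_BP_IN_PRIMER
frame_offset = (3 - preceding % 3) % 3

junction_peptide = translate(PRIMER_993728[frame_offset:])
print(frame_offset, junction_peptide)
```

Under these assumptions the junction reads Pro-Val-Leu-Ala-Leu-Ala followed by Lys-Asp-Asp-Leu-Ala-Tyr-Ser, that is, the last alanine codon of the signal sequence is followed in frame by the lysine codon at which the amplified beta-glucosidase coding region begins.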

The amplification reactions (50 μl) were composed of Pfx Amplification Buffer, 0.25 mM dNTPs, 10 ng/μl pMJ05, 6.4 μM primer 993724, 3.2 μM primer 993729, 1 mM MgCl₂, and 2.5 units of Pfx DNA polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 60 seconds at 94° C., 60 seconds at 60° C., and 240 seconds at 72° C. (15 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 1063 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The purified overlapping fragments were used as a template for amplification using primer 993724 (sense) and primer 993727 (antisense) described above to precisely fuse the 1063 bp fragment comprising the Trichoderma reesei cbh1 promoter/Humicola insolens endoglucanase V signal sequence to the 2523 bp fragment comprising the Aspergillus oryzae beta-glucosidase open reading frame by overlapping PCR.

The amplification reactions (50 μl) were composed of Pfx Amplification Buffer, 0.25 mM dNTPs, 6.4 μM primer 993724, 3.2 μM primer 993727, 1 mM MgCl₂, and 2.5 units of Pfx DNA polymerase. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: 30 cycles each for 60 seconds at 94° C., 60 seconds at 60° C., and 240 seconds at 72° C. (15 minute final extension). The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 3591 bp product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The resulting 3591 bp fragment was digested with Sal I and Spe I and ligated into pMJ04 digested with the same restriction enzymes to generate pSMai135 (FIG. 10).

Example 7 Expression of Aspergillus oryzae Beta-Glucosidase Comparing Native and Heterologous Humicola insolens Endoglucanase V Secretion Signal in Trichoderma reesei

Plasmid pSMai130, in which the Aspergillus oryzae beta-glucosidase is expressed from the cbh1 promoter and native secretion signal (FIG. 8, SEQ ID NOs: 34 (DNA sequence) and 35 (deduced amino acid sequence)), or pSMai135 encoding the mature Aspergillus oryzae beta-glucosidase enzyme linked to the Humicola insolens endoglucanase V secretion signal (FIG. 9, SEQ ID NOs: 36 (DNA sequence) and 37 (deduced amino acid sequence)), was introduced into Trichoderma reesei RutC30 by PEG-mediated transformation (Penttila et al., 1987, supra). Both plasmids contain the Aspergillus nidulans amdS gene to enable transformants to grow on acetamide as the sole nitrogen source.

Trichoderma reesei RutC30 was cultivated at 27° C. and 90 rpm in 25 ml of YP medium supplemented with 2% (w/v) glucose and 10 mM uridine for 17 hours. Mycelia were collected by filtration using Millipore's Vacuum Driven Disposable Filtration System (Millipore, Bedford, Mass.) and washed twice with deionized water and twice with 1.2 M sorbitol. Protoplasts were generated by suspending the washed mycelia in 20 ml of 1.2 M sorbitol containing 15 mg of Glucanex (Novozymes A/S, Bagsværd, Denmark) per ml and 0.36 units of chitinase (Sigma Chemical Co., St. Louis, Mo.) per ml and incubating for 15-25 minutes at 34° C. with gentle shaking at 90 rpm. Protoplasts were collected by centrifuging for 7 minutes at 400×g and washed twice with cold 1.2 M sorbitol. The protoplasts were counted using a haemacytometer and re-suspended in STC to a final concentration of 1×10⁸ protoplasts per ml. Excess protoplasts were stored in a Cryo 1° C. Freezing Container (Nalgene, Rochester, N.Y.) at −80° C.

Approximately 7 μg of Pme I digested expression plasmid (pSMai130 or pSMai135) was added to 100 μl of protoplast solution and mixed gently, followed by 260 μl of PEG buffer, mixed, and incubated at room temperature for 30 minutes. STC (3 ml) was then added and mixed, and the transformation solution was plated onto COVE plates using Aspergillus nidulans amdS selection. The plates were incubated at 28° C. for 5-7 days. Transformants were sub-cultured onto COVE2 plates and grown at 28° C.

One hundred and ten amdS positive transformants were obtained with pSMai130 and 65 transformants with pSMai135. Twenty transformants designated SMA130 obtained with pSMai130 (native secretion signal) and 67 transformants designated SMA135 obtained with pSMai135 (heterologous secretion signal) were subcultured onto fresh plates containing acetamide and allowed to sporulate for 7 days at 28° C.

The 20 SMA130 and 67 SMA135 Trichoderma reesei transformants were cultivated in 125 ml baffled shake flasks containing 25 ml of cellulase-inducing media at pH 6.0 inoculated with spores of the transformants and incubated at 28° C. and 200 rpm for 7 days. Trichoderma reesei RutC30 was run as a control. Culture broth samples were removed at day 7. One ml of each culture broth was centrifuged at 15,700×g for 5 minutes in a micro-centrifuge and the supernatants transferred to new tubes. Samples were stored at 4° C. until enzyme assay. The supernatants were assayed for beta-glucosidase activity using p-nitrophenyl-beta-D-glucopyranoside as substrate, as described above.

All 20 SMA130 transformants exhibited beta-glucosidase activity equivalent to that of the host strain, Trichoderma reesei RutC30. In contrast, a number of SMA135 transformants showed beta-glucosidase activities several-fold higher than that of Trichoderma reesei RutC30. Transformant SMA135-04 produced the highest beta-glucosidase activity, having 7 times more beta-glucosidase activity than Trichoderma reesei RutC30 run as a control.

SDS-PAGE was carried out using Criterion Tris-HCl (5% resolving) gels (BioRad, Hercules, Calif.) with the Criterion System (BioRad, Hercules, Calif.). Five μl of day 7 supernatants (see above) were suspended in 2× concentration of Laemmli Sample Buffer (BioRad, Hercules, Calif.) and boiled in the presence of 5% beta-mercaptoethanol for 3 minutes. The supernatant samples were loaded onto a polyacrylamide gel and subjected to electrophoresis with 1× Tris/Glycine/SDS as running buffer (BioRad, Hercules, Calif.). The resulting gel was stained with BioRad's Bio-Safe Coomassie Stain.

No beta-glucosidase protein was visible by SDS-PAGE for the Trichoderma reesei SMA130 transformant culture broth supernatants. In contrast, 26 of the 38 Trichoderma reesei SMA135 transformants produced a protein of approximately 110 kDa that was not visible in Trichoderma reesei RutC30 as control. Transformant Trichoderma reesei SMA135-04 produced the highest level of beta-glucosidase.

Example 8 Fermentation of Trichoderma reesei SMA135-04

Fermentation was performed on Trichoderma reesei SMA135-04 to determine the production level of beta-glucosidase activity. Trichoderma reesei RutC30 (host strain) was run as a control. Spores of Trichoderma reesei SMA135-04 were inoculated into 500 ml shake flasks containing 100 ml of Inoculum Medium. The flasks were placed into an orbital shaker at 28° C. for approximately 48 hours, at which time 50 ml of the culture was inoculated into 1.8 liters of Fermentation Medium (see above) in a 2 liter fermentation vessel. The fermentations were run at a pH of 5.0 and 28° C., with a minimum dissolved oxygen of 25%, a 1.0 VVM air flow, and an agitation of 1100. Feed Medium was administered into the fermentation vessel at 18 hours with a feed rate of 3.6 g/hour for 33 hours and then 7.2 g/hour. The fermentations ran for 165 hours, at which time the final fermentation broths were centrifuged and the supernatants stored at −20° C. until beta-glucosidase activity assay using the procedure described earlier.
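The feed schedule described above can be converted into an approximate total mass of Feed Medium delivered. The following Python sketch is illustrative only and rests on an assumption not stated in the example, namely that feeding at 7.2 g/hour continued without interruption until the end of the 165 hour run. The function name total_feed_g is hypothetical.

```python
def total_feed_g(feed_start_h, run_end_h, first_rate, first_duration_h, second_rate):
    """Total feed mass for a two-stage schedule.

    Assumes the second rate is held from the end of the first stage until
    the end of the run (an assumption, not stated in the example).
    """
    second_duration_h = run_end_h - feed_start_h - first_duration_h
    if second_duration_h < 0:
        raise ValueError("first stage extends past the end of the run")
    return first_rate * first_duration_h + second_rate * second_duration_h


feed_g = total_feed_g(
    feed_start_h=18,      # feed begins at 18 hours
    run_end_h=165,        # fermentation ends at 165 hours
    first_rate=3.6,       # g/hour
    first_duration_h=33,  # hours at the first rate
    second_rate=7.2,      # g/hour thereafter
)
print(f"Approximate total feed: {feed_g:.1f} g")
```

Under this assumption the first stage contributes about 118.8 g and the second stage, lasting 114 hours, about 820.8 g, for a total of approximately 939.6 g of Feed Medium.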

Beta-glucosidase activity of the Trichoderma reesei SMA135-04 fermentation sample was determined to be approximately 8 times higher than that of Trichoderma reesei RutC30.

Example 9 Construction of pSATe111 and pALFd1 Saccharomyces cerevisiae Expression Vectors

A 2,605 bp DNA fragment comprising the region from the ATG start codon to the TAA stop codon of the Aspergillus oryzae beta-glucosidase coding sequence (SEQ ID NO: 42 for cDNA sequence and SEQ ID NO: 43 for the deduced amino acid sequence) was amplified by PCR from pJaL660 (WO 2002/095014) as template with primers 992127 (sense) and 992328 (antisense) shown below:

992127: 5′-GCAGATCTACCATGAAGCTTGGTTGGATCGAG-3′ (SEQ ID NO: 38)
992328: 5′-GCCTCAGATTACTGGGCCTTAGGCAGCGAG-3′ (SEQ ID NO: 39)

Primer 992127 has an upstream Bgl II site and primer 992328 has a downstream Xho I site.

The amplification reactions (50 μl) were composed of 1× PCR buffer containing MgCl₂ (Roche Applied Science, Mannheim, Germany), 0.25 mM dNTPs, 50 μM primer 992127, 50 μM primer 992328, 80 ng of pJaL660, and 2.5 units of Pwo DNA Polymerase (Roche Applied Science, Mannheim, Germany). The reactions were incubated in an Eppendorf Mastercycler 5333 programmed for 1 cycle at 94° C. for 5 minutes followed by 25 cycles each at 94° C. for 60 seconds, 55° C. for 60 seconds, and 72° C. for 120 seconds (10 minute final extension). The PCR product was then subcloned into the pCR-Blunt II-TOPO vector using the Zero Blunt™ TOPO PCR Cloning Kit (Invitrogen, Carlsbad, Calif.) following the manufacturer's instructions to generate plasmid pSATe101 (FIG. 11). Plasmid pSATe101 was digested with Bgl II and Xho I to liberate the beta-glucosidase gene. The reaction products were isolated on a 1.0% agarose gel using TAE buffer where a 2.6 kb product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The 2.6 kb PCR product was digested and cloned into Bam HI and Xho I sites of the copper inducible 2 μm yeast expression vector pCu426 (Labbe and Thiele, 1999, Methods Enzymol. 306: 145-53), to generate pSATe111 (FIG. 12).
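Cloning of a Bgl II-Xho I fragment into Bam HI and Xho I sites is possible because Bgl II and Bam HI generate identical cohesive ends; the same holds for the Sal I and Xho I ends joined in Example 3. The following Python sketch illustrates this using the standard recognition sequences and cut positions of these enzymes, which are supplied here as assumptions rather than taken from the examples. It is valid only for palindromic sites cut to leave a 5′ overhang, and the helper names are hypothetical.

```python
# Recognition sequence and top-strand cut position (nucleotides from the 5' end).
# Standard values for these enzymes; all leave 5' overhangs.
ENZYMES = {
    "Bgl II": ("AGATCT", 1),
    "Bam HI": ("GGATCC", 1),
    "Sal I": ("GTCGAC", 1),
    "Xho I": ("CTCGAG", 1),
    "Spe I": ("ACTAGT", 1),
}


def overhang(site, cut):
    """5' overhang left by a symmetric cut in a palindromic site."""
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """True if the two enzymes leave identical single-stranded overhangs."""
    return overhang(*ENZYMES[enzyme_a]) == overhang(*ENZYMES[enzyme_b])


for pair in [("Bgl II", "Bam HI"), ("Sal I", "Xho I"), ("Spe I", "Bam HI")]:
    print(pair, compatible(*pair))
```

Bgl II and Bam HI both leave a GATC overhang and Sal I and Xho I both leave a TCGA overhang, whereas the CTAG overhang of Spe I is not compatible with either pair. A junction formed between ends from two different enzymes generally recreates neither original recognition site.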

Plasmid pALFd1 was constructed to determine if enhanced Aspergillus oryzae beta-glucosidase production and secretion could also be achieved in Saccharomyces cerevisiae by swapping the native Aspergillus oryzae beta-glucosidase secretion signal with the Humicola insolens endoglucanase V signal peptide.

Plasmid pSATe111 was digested with Xho I and Spe I to release 2.6 kb (Aspergillus oryzae beta-glucosidase) and 6 kb (rest of the vector) fragments. The 6 kb fragment was isolated and ligated to the 2.6 kb PCR fragment, containing the Aspergillus oryzae beta-glucosidase coding region (minus the secretion signal sequence) and the Humicola insolens endoglucanase V signal sequence, which was amplified from pSMai135 using primers 993950 and 993951 shown below. The primers contain the Xho I and Spe I restriction sites at their ends for subsequent subcloning into the Xho I and Spe I restriction sites of pSATe111.

Primer 993950: 5′-AATCCGACTAGTGGATCTACCATGCGTTCCTCCCCCCTCC-3′ (SEQ ID NO: 40)
Primer 993951: 5′-GCGGGCCTCGAGTTACTGGGCCTTAGGCAGCG-3′ (SEQ ID NO: 41)

The amplification reactions (100 μl) were composed of ThermoPol Reaction Buffer, 0.20 mM dNTPs, 0.14 μg of pSMai135 plasmid DNA, 50 μM primer 993950, 50 μM primer 993951, and 2 units of Vent DNA polymerase. The reactions were incubated in a RoboCycler Gradient 40 Thermal Cycler (Stratagene, La Jolla, Calif.) programmed as follows: one cycle of 1 minute at 95° C., 25 cycles each for 1 minute at 95° C., 1 minute at 60 or 64° C., and 3 minutes at 72° C. (10 minute final extension). The reaction products were visualized on a 0.7% agarose gel using TAE buffer. The resulting 2.6 kb fragment bands were purified using a MinElute PCR Purification Kit (QIAGEN, Chatsworth, Calif.) according to the manufacturer's instructions. The purified fragments were combined, digested with Xho I and Spe I, and ligated into pSATe111 digested with the same two restriction enzymes to generate pALFd1 (FIG. 13).

Example 10 Expression of Aspergillus oryzae Beta-Glucosidase Comparing Native and Heterologous Secretion Signal in Saccharomyces cerevisiae

Plasmid pALFd1 (approximately 600 ng) was transformed into freshly made Saccharomyces cerevisiae YNG 318 competent cells according to the YEASTMAKER Yeast Transformation Protocol, CLONTECH Laboratories, Inc., Palo Alto, Calif. Transformed cells were plated onto yeast selection plates containing 0.15 mg of the chromogenic substrate 5-bromo-4-chloro-3-indolyl-beta-D-glucopyranoside per ml, which yields blue colonies when beta-glucosidase is present. The plates were incubated at 30° C. for 4 days.

Colonies harboring the expression vector with the Humicola insolens endoglucanase V secretion signal were generally darker blue in color than the colonies that had the native Aspergillus oryzae beta-glucosidase signal sequence, indicating that more Aspergillus oryzae beta-glucosidase was secreted using the Humicola insolens endoglucanase V secretion signal. Approximately 242 blue colonies from both constructs were picked using an automated colony picker (QPix, Genetix USA, Inc., Boston, Mass.). The 242 transformants were inoculated into yeast selection medium (which contains copper) to induce expression and secretion of Aspergillus oryzae beta-glucosidase. Broth from the day 7 96-well culture was taken from each of the 242 colonies and assayed for beta-glucosidase activity using p-nitrophenyl-beta-D-glucopyranoside as substrate as described above. The results showed that colonies expressing beta-glucosidase with the heterologous signal sequence were 6.6 times more active than the colonies that were transformed with the Aspergillus oryzae beta-glucosidase with the native secretion signal.

Example 11 Identification of a Glycosyl Hydrolase Family GH3A Gene in the Genomic Sequence of Aspergillus fumigatus

A tblastn search (Altschul et al., 1997, Nucleic Acids Res. 25: 3389-3402) of the Aspergillus fumigatus partial genome sequence (The Institute for Genomic Research, Rockville, Md.) was carried out using as query a beta-glucosidase protein sequence from Aspergillus aculeatus (Accession No. P48825). Several genes were identified as putative Family GH3A homologs based upon a high degree of similarity to the query sequence at the amino acid level. One genomic region of approximately 3000 bp with greater than 70% identity to the query sequence at the amino acid level was chosen for further study.

Example 12 Aspergillus fumigatus Genomic DNA Extraction

Aspergillus fumigatus PaHa34 was grown in 250 ml of potato dextrose medium in a baffled shake flask at 37° C. and 240 rpm. Mycelia were harvested by filtration, washed twice in TE buffer (10 mM Tris-1 mM EDTA), and frozen under liquid nitrogen. Frozen mycelia were ground by mortar and pestle to a fine powder, which was resuspended in pH 8.0 buffer containing 10 mM Tris, 100 mM EDTA, 1% Triton X-100, 0.5 M guanidine-HCl, and 200 mM NaCl. DNase-free RNase A was added at a concentration of 20 μg/ml and the lysate was incubated at 37° C. for 30 minutes. Cellular debris was removed by centrifugation, and DNA was isolated by using a Qiagen Maxi 500 column (QIAGEN Inc., Chatsworth, Calif.). The columns were equilibrated in 10 ml of QBT, washed with 30 ml of QC, and eluted with 15 ml of QF (all buffers from QIAGEN Inc., Chatsworth, Calif.). DNA was precipitated in isopropanol, washed in 70% ethanol, and recovered by centrifugation. The DNA was resuspended in TE buffer.

Example 13 Cloning of the Family GH3A Beta-Glucosidase Gene and Construction of an Aspergillus oryzae Expression Vector

Two synthetic oligonucleotide primers shown below were designed to PCR amplify an Aspergillus fumigatus PaHa34 gene encoding a Family GH3A beta-glucosidase from the genomic DNA prepared in Example 12. An InFusion Cloning Kit (BD Biosciences, Palo Alto, Calif.) was used to clone the fragment directly into the expression vector, pAILo2 (FIG. 14), without the need for restriction digests and ligation.

Forward primer: 5′-ACTGGATTTACCATGAGATTCGGTTGGCTCG-3′ (SEQ ID NO: 44)
Reverse primer: 5′-AGTCACCTCTAGTTACTAGTAGACACGGGGC-3′ (SEQ ID NO: 45)

Bold letters represent coding sequence. The remaining sequence is homologous to the insertion sites of pAILo2, described in Example 7.

Fifty picomoles of each of the primers above were used in a PCR reaction containing 100 ng of Aspergillus fumigatus genomic DNA, 1× Pfx Amplification Buffer, 1.5 μl of 10 mM blend of dATP, dTTP, dGTP, and dCTP, 2.5 units of Pfx DNA Polymerase, 1 μl of 50 mM MgSO₄ and 2.5 μl of 10× pCRx Enhancer solution (Invitrogen, Carlsbad, Calif.) in a final volume of 50 μl. The reactions were incubated in an Eppendorf Mastercycler 5333 programmed as follows: one cycle at 94° C. for 2 minutes; and 30 cycles each at 94° C. for 15 seconds, 55° C. for 30 seconds, and 68° C. for 3 minutes. The heat block then went to a 4° C. soak cycle.
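The final concentrations implied by the volumes stated above follow from simple dilution (C1 × V1 = C2 × V2). The following Python sketch is illustrative only. It assumes that the "10 mM blend" contains 10 mM of each dNTP, which the text does not state, and it counts only the added MgSO₄, not any magnesium already present in the amplification buffer. The helper name final_conc is hypothetical.

```python
# Dilution arithmetic for the 50 ul PCR described above (C1 * V1 = C2 * V2).

def final_conc(stock_conc, stock_vol_ul, final_vol_ul):
    """Return the concentration after dilution, in the units of stock_conc."""
    return stock_conc * stock_vol_ul / final_vol_ul

FINAL_VOL_UL = 50.0

# 1 ul of 50 mM MgSO4 (added MgSO4 only)
mgso4_mM = final_conc(50.0, 1.0, FINAL_VOL_UL)
# 1.5 ul of dNTP blend; ASSUMPTION: 10 mM of each dNTP in the blend
dntp_each_mM = final_conc(10.0, 1.5, FINAL_VOL_UL)
# 2.5 ul of 10x enhancer solution
enhancer_x = final_conc(10.0, 2.5, FINAL_VOL_UL)
# 50 pmol of primer in 50 ul; pmol/ul is numerically equal to uM
primer_uM = 50.0 / FINAL_VOL_UL

print(f"Added MgSO4:  {mgso4_mM:.2f} mM")
print(f"Each dNTP:    {dntp_each_mM:.2f} mM (assuming 10 mM each in the blend)")
print(f"Enhancer:     {enhancer_x:.2f}x")
print(f"Each primer:  {primer_uM:.2f} uM")
```

Under these assumptions the enhancer is used at 0.5× and each primer at 1 μM.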

The reaction products were isolated on a 1.0% agarose gel using TAE buffer, where a 3 kb product band was excised from the gel and purified using a QIAquick Gel Extraction Kit according to the manufacturer's instructions.

The fragment was then cloned into the pAILo2 expression vector using an InFusion Cloning Kit. The vector was digested with Nco I and Pac I. The fragment was purified by gel electrophoresis and QIAquick gel purification. The gene fragment and digested vector were ligated together in a reaction resulting in the expression plasmid pEJG97 (FIG. 15) in which transcription of the Family GH3A beta-glucosidase gene was under the control of the NA2-tpi promoter. The ligation reaction (50 μl) was composed of 1× InFusion Buffer (BD Biosciences, Palo Alto, Calif.), 1× BSA (BD Biosciences, Palo Alto, Calif.), 1 μl of InFusion enzyme (diluted 1:10) (BD Biosciences, Palo Alto, Calif.), 150 ng of pAILo2 digested with Nco I and Pac I, and 50 ng of the Aspergillus fumigatus beta-glucosidase purified PCR product. The reaction was incubated at room temperature for 30 minutes. One μl of the reaction was used to transform E. coli XL10 Solopac Gold cells (Stratagene, La Jolla, Calif.). An E. coli transformant containing the pEJG97 plasmid was detected by restriction digestion of the plasmid DNA.

Example 14 Characterization of the Aspergillus fumigatus Genomic Sequence Encoding a Family GH3A Beta-Glucosidase

DNA sequencing of the Aspergillus fumigatus beta-glucosidase gene from pEJG97 was performed as described previously using a primer walking strategy. A gene model for the Aspergillus fumigatus sequence was constructed based on similarity to homologous genes from Aspergillus aculeatus, Aspergillus niger, and Aspergillus kawachii. The nucleotide sequence (SEQ ID NO: 46) and deduced amino acid sequence (SEQ ID NO: 47) are shown in FIGS. 16A and 16B. The genomic fragment encodes a polypeptide of 863 amino acids, interrupted by 8 introns of 62, 55, 58, 63, 58, 58, 63 and 51 bp. The G+C content of the gene is 54.3%. Using the SignalP software program (Nielsen et al., 1997, Protein Engineering 10: 1-6), a signal peptide of 19 residues was predicted. The predicted mature protein contains 844 amino acids with a molecular mass of 91.7 kDa.
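The figures in this gene model can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes that the coding sequence ends with a single stop codon and that the genomic span is counted from the ATG through that stop codon; neither convention is stated in the text.

```python
# Cross-check of the gene model figures reported above.
INTRONS_BP = [62, 55, 58, 63, 58, 58, 63, 51]  # eight introns, as listed
PROTEIN_AA = 863                               # full-length polypeptide
SIGNAL_AA = 19                                 # predicted signal peptide

intron_total_bp = sum(INTRONS_BP)
# ASSUMPTION: one stop codon is counted as part of the coding sequence.
cds_bp = (PROTEIN_AA + 1) * 3
# ASSUMPTION: span runs from the ATG through the stop codon, introns included.
genomic_span_bp = cds_bp + intron_total_bp
mature_aa = PROTEIN_AA - SIGNAL_AA

print(f"Number of introns:      {len(INTRONS_BP)}")
print(f"Total intron length:    {intron_total_bp} bp")
print(f"Coding sequence:        {cds_bp} bp")
print(f"ATG-to-stop span:       {genomic_span_bp} bp")
print(f"Mature protein length:  {mature_aa} aa")
```

Under these assumptions the ATG-to-stop span comes to 3,060 bp, in line with the approximately 3 kb PCR product, and the mature length of 844 residues matches the value given above.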

A comparative alignment of beta-glucosidase sequences was determined using the Clustal W method (Higgins, 1989, CABIOS 5: 151-153) using the LASERGENE™ MEGALIGN™ software (DNASTAR, Inc., Madison, Wis.) with an identity table and the following multiple alignment parameters: Gap penalty of 10 and gap length penalty of 10. Pairwise alignment parameters were Ktuple=1, gap penalty=3, windows=5, and diagonals=5. The alignment showed that the deduced amino acid sequence of the Aspergillus fumigatus beta-glucosidase gene shares 78%, 76%, and 76% identity to the deduced amino acid sequences of the Aspergillus aculeatus (accession number P48825), Aspergillus niger (accession number O00089), and Aspergillus kawachii (accession number P87076) beta-glucosidases, respectively.

Example 15 Expression of the Aspergillus fumigatus Family GH3A Beta-Glucosidase Gene in Aspergillus oryzae JaL250

Aspergillus oryzae JaL250 protoplasts were prepared according to the method of Christensen et al., 1988, Bio/Technology 6: 1419-1422. Five μg of pEJG97 (as well as pAILo2 as a vector control) was used to transform Aspergillus oryzae JaL250.

The transformation of Aspergillus oryzae JaL250 with pEJG97 yielded about 100 transformants. Ten transformants were isolated to individual PDA plates.

Confluent PDA plates of five of the ten transformants were washed with 5 ml of 0.01% Tween 20 and inoculated separately into 25 ml of MDU2BP medium in 125 ml glass shake flasks and incubated at 34° C., 250 rpm. After five days of incubation, 0.5 μl of supernatant from each culture was analyzed using 8-16% Tris-Glycine SDS-PAGE gels (Invitrogen, Carlsbad, Calif.) according to the manufacturer's instructions. SDS-PAGE profiles of the cultures showed that one of the transformants (designated transformant 1) had a major band of approximately 130 kDa.

Example 16 Extraction of Total RNA from Aspergillus oryzae

The Aspergillus oryzae transformant described in Example 15 was frozen in liquid nitrogen and stored at −80° C. Subsequently, the frozen tissue was ground in an electric coffee grinder with a few chips of dry ice added to keep the powdered mycelia frozen. Then, the ground material was transferred with a spatula to a 50 ml sterile conical tube which had been previously filled with 20 ml of Fenozol (Active Motif, Inc., Carlsbad, Calif.). The mixture was mixed rapidly to dissolve the frozen material to a thick solution, and placed in a 50° C. water bath for 15 minutes. Five ml of RNase-free chloroform was added, and the mixture was vortexed vigorously. Then, the mixture was allowed to stand at room temperature for 10 minutes. Next the mixture was centrifuged at 1300×g in a Sorvall RT7 centrifuge (Sorvall, Inc, Newtown, Conn.) at room temperature for 20 minutes. The top phase was transferred to a new conical tube and an equal volume of phenol-chloroform-isoamyl alcohol (25:24:1) was added. The mixture was vortexed and centrifuged for 10 minutes. This procedure was repeated twice so that three phenol-chloroform-isoamyl alcohol extractions were done. Then, the top phase was transferred to a new tube and an equal volume of chloroform:isoamyl alcohol (24:1) was added. The mixture was vortexed once again and centrifuged for 10 minutes. After centrifugation, the aqueous phase (approximately 5 ml) was transferred to a new Oak Ridge tube and 0.5 ml of 3 M sodium acetate pH 5.2 and 6.25 ml of isopropanol were added. The mixture was mixed and incubated at room temperature for 15 minutes. Subsequently, the mixture was centrifuged at 12,000×g for 30 minutes at 4° C. in a Sorvall RC5B (Sorvall, Inc, Newtown, Conn.). Following centrifugation, the supernatant was removed and 18 ml of 70% ethanol was carefully added to the pellet. Another centrifugation step was done for 10 minutes at 4° C. at 12,000×g. The supernatant was carefully removed and the pellet was air dried. The RNA pellet was resuspended in 500 μl of diethyl pyrocarbonate (DEPC)-treated water. Heating at 65° C. for 10 minutes aided in resuspension. The total RNA was stored at −80° C. Quantitation and assessment of RNA quality were done on an Agilent Bioanalyzer 2100 (Englewood, Colo.) using RNA chips. All the materials and reagents used in this protocol were RNase-free.

Example 17 Cloning of the Aspergillus fumigatus Beta-Glucosidase cDNA Sequence

The total RNA described in Example 16 was used to clone the Aspergillus fumigatus beta-glucosidase cDNA sequence (SEQ ID NO: 48 for cDNA sequence and SEQ ID NO: 49 for the deduced amino acid sequence). The mRNA from the total RNA was purified using a Poly(A)Purist Mag Kit (Ambion, Inc., Austin, Tex.) following the manufacturer's instructions. The Aspergillus fumigatus beta-glucosidase cDNA sequence was then amplified in two fragments: a 1,337 bp DNA fragment spanning from the ATG start codon to the 1,332 position (labeled as 5′ fragment) and a second 1,300 bp DNA fragment (labeled 3′ fragment) spanning from the 1,303 position until the stop codon using the ProStar UltraHF RT-PCR System (Stratagene, La Jolla, Calif.), following the manufacturer's protocol for a 50 μl reaction using 200 ng of poly-A mRNA with primers Afuma (sense) and Afumc (antisense) for the 5′ fragment and primers Afumd (sense) and Afumb (antisense) for the 3′ fragment as shown below:

Afuma: 5′-GGCTCATGAGATTCGGTTGGCTCGAGGTC-3′ (SEQ ID NO: 50)
Afumc: 5′-GCCGTTATCACAGCCGCGGTCGGGGCAGCC-3′ (SEQ ID NO: 51)
Afumd: 5′-GGCTGCCCCGACCGCGGCTGTGATAACGGC-3′ (SEQ ID NO: 52)
Afumb: 5′-GCTTAATTAATCTAGTAGACACGGGGCAGAGGCGC-3′ (SEQ ID NO: 53)

Primer Afuma has an upstream Bsp HI site and primer Afumb has a downstream Pac I site. Twenty-nine nucleotides at the 3′-end of the 1,337 fragment overlapped with the 5′-end of the 1,303 fragment. In the overlap region there was a unique Sac II site.
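The relationships among these four primers can be checked directly from their sequences. The Python sketch below is illustrative only. It uses the standard recognition sequences for Bsp HI (TCATGA), Sac II (CCGCGG), and Pac I (TTAATTAA), and a hypothetical helper named revcomp. It tests whether Afumc and Afumd are reverse complements of each other, which is what places them in the overlap between the two fragments, and it reports the primer length rather than asserting the size of the overlap.

```python
# Sequence checks on the RT-PCR primers listed above.

def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

AFUMA = "GGCTCATGAGATTCGGTTGGCTCGAGGTC"        # SEQ ID NO: 50, sense
AFUMC = "GCCGTTATCACAGCCGCGGTCGGGGCAGCC"       # SEQ ID NO: 51, antisense
AFUMD = "GGCTGCCCCGACCGCGGCTGTGATAACGGC"       # SEQ ID NO: 52, sense
AFUMB = "GCTTAATTAATCTAGTAGACACGGGGCAGAGGCGC"  # SEQ ID NO: 53, antisense

# Standard recognition sequences (all palindromic).
BSP_HI = "TCATGA"
SAC_II = "CCGCGG"
PAC_I = "TTAATTAA"

overlap_primers_match = revcomp(AFUMC) == AFUMD
has_bsphi = BSP_HI in AFUMA
has_paci = PAC_I in AFUMB
has_sacii = SAC_II in AFUMD and SAC_II in AFUMC

print(f"Afumc is the reverse complement of Afumd: {overlap_primers_match}")
print(f"Length of Afumc/Afumd: {len(AFUMC)}/{len(AFUMD)} nt")
print(f"Bsp HI site in Afuma: {has_bsphi}")
print(f"Pac I site in Afumb:  {has_paci}")
print(f"Sac II site in Afumc and Afumd: {has_sacii}")
```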

Both fragments were subcloned individually into the pCR-BluntII-TOPO vector using a Zero Blunt™ TOPO PCR Cloning Kit for sequencing, following the manufacturer's protocol, generating plasmids pCR4Blunt-TOPOAfcDNA5′ (FIG. 17) and pCR4Blunt-TOPOAfcDNA3′ (FIG. 18), containing the 5′ and 3′ fragments, respectively.

The entire coding region of both Aspergillus fumigatus beta-glucosidase fragments was confirmed by sequencing using 0.5 μl of each plasmid DNA and 3.2 pmol of the following primers:

BGLU1.for: 5′-ACACTGGCGGAGAAGG-3′ (SEQ ID NO: 54)
BGLU2.for: 5′-GCCCAGGGATATGGTTAC-3′ (SEQ ID NO: 55)
BGLU3.for: 5′-CGACTCTGGAGAGGGTTTC-3′ (SEQ ID NO: 56)
BGLU4.rev: 5′-GGACTGGGTCATCACAAAG-3′ (SEQ ID NO: 57)
BGLU5.rev: 5′-GCGAGAGGTCATCAGCA-3′ (SEQ ID NO: 58)
M13 forward: 5′-GTAAAACGACGGCCAGT-3′ (SEQ ID NO: 59)
M13 reverse: 5′-CAGGAAACAGCTATGA-3′ (SEQ ID NO: 60)

Sequencing results indicated the presence of several nucleotide changes when comparing the Aspergillus fumigatus beta-glucosidase cDNA sequence obtained to the Aspergillus fumigatus beta-glucosidase cDNA sequence deduced from genome data of The Institute for Genomic Research (Rockville, Md.). At position 500, T was replaced by C, so that the coding sequence GTT was changed to GCT and valine was replaced by alanine. At position 903, T was replaced by C, so that the coding sequence CCT was changed to CCC; however, this change was silent. At position 2,191, G was replaced by C, so that the coding sequence GAG was changed to CAG and glutamic acid was replaced by glutamine. Finally, at position 2,368, C was replaced by T, so that the coding sequence CTG was changed to TTG; however, this change was also silent.
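Each reported substitution can be checked against the standard genetic code. The Python sketch below is illustrative only. It makes two assumptions: each codon pair is written in the direction implied by the stated base replacement (genome-derived codon first), and positions are numbered from the A of the ATG start codon as position 1. The codon table holds only the eight codons needed here, and the function names are hypothetical.

```python
# Check the four nucleotide changes against the standard genetic code.

# Subset of the standard genetic code covering the codons discussed above.
CODON_TO_AA = {
    "GTT": "Val", "GCT": "Ala",
    "CCT": "Pro", "CCC": "Pro",
    "GAG": "Glu", "CAG": "Gln",
    "CTG": "Leu", "TTG": "Leu",
}

# (position, base before, base after, codon before, codon after)
# ASSUMPTION: codons are oriented to match the stated base replacement.
CHANGES = [
    (500, "T", "C", "GTT", "GCT"),
    (903, "T", "C", "CCT", "CCC"),
    (2191, "G", "C", "GAG", "CAG"),
    (2368, "C", "T", "CTG", "TTG"),
]

def changed_index(before, after):
    """Return the 0-based index of the single base that differs in a codon pair."""
    diffs = [i for i in range(3) if before[i] != after[i]]
    if len(diffs) != 1:
        raise ValueError(f"{before}->{after} is not a single-base change")
    return diffs[0]

def classify(before, after):
    """Return 'silent' or the amino acid replacement caused by a codon change."""
    aa_before, aa_after = CODON_TO_AA[before], CODON_TO_AA[after]
    return "silent" if aa_before == aa_after else f"{aa_before} -> {aa_after}"

results = {}
for pos, base_before, base_after, codon_before, codon_after in CHANGES:
    idx = changed_index(codon_before, codon_after)
    # The differing base must match the stated replacement.
    bases_ok = (codon_before[idx], codon_after[idx]) == (base_before, base_after)
    # ASSUMPTION: position 1 is the A of ATG, so (pos - 1) % 3 is the codon index.
    frame_ok = (pos - 1) % 3 == idx
    results[pos] = (classify(codon_before, codon_after), bases_ok, frame_ok)
    print(pos, codon_before, "->", codon_after, results[pos])
```

With these inputs the changes at positions 500 and 2,191 alter the encoded residue, and those at positions 903 and 2,368 do not.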

Once the two fragments had been sequenced, approximately 9 μg of plasmid DNA from each clone was digested with Sac II and Pme I. Digestion of the pCR4Blunt-TOPOAfcDNA5′ vector with these enzymes generated a fragment of 3,956 bp (containing most of the vector) and a second fragment of 1,339 bp (containing the Aspergillus fumigatus beta-glucosidase cDNA 5′ fragment). Digestion of the pCR4Blunt-TOPOAfcDNA3′ vector with the same enzymes generated a 5,227 bp fragment (containing most of the pCR4Blunt-TOPO vector and the Aspergillus fumigatus beta-glucosidase cDNA 3′ fragment) and a second fragment of 31 bp.

Digested pCR4Blunt-TOPOAfcDNA3′ was treated with shrimp alkaline phosphatase for dephosphorylation of the digested DNA products by adding 1× SAP buffer and 1 μl of shrimp alkaline phosphatase (Roche Applied Science, Mannheim, Germany) and incubating the reaction for 10 minutes at 37° C. followed by incubation at 85° C. for 10 minutes for enzyme inactivation. Both digestions were run on 0.7% agarose gel with TAE buffer and purified using a QIAGEN Gel Purification Kit according to the manufacturer's instructions. The 1,339 bp band generated from the pCR4Blunt-TOPOAfcDNA5′ digestion and the 5,227 bp fragment generated from the pCR4Blunt-TOPOAfcDNA3′ digestion were ligated by using the Rapid DNA Ligation Kit (Roche Applied Science, Mannheim, Germany) following the manufacturer's instructions. The ligation reaction was transformed into XL1-Blue E. coli subcloning-competent cells according to the manufacturer's instructions (Stratagene, La Jolla, Calif.). Upon transformation, plasmid DNA from an isolated colony was sequenced to confirm that both the 5′ and 3′ fragments of the Aspergillus fumigatus beta-glucosidase cDNA were subcloned in tandem generating a 6,566 bp pCR4Blunt-TOPOAfcDNA vector (FIG. 19).
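The fragment sizes reported for the two Sac II/Pme I digestions can be reconciled with the size of the final plasmid by simple addition. The Python sketch below is illustrative only and uses the 5,227 bp value given for the large fragment of the pCR4Blunt-TOPOAfcDNA3′ digestion. The variable names are hypothetical.

```python
# Fragment-size bookkeeping for the Sac II / Pme I digests and the ligation.

# Digest of pCR4Blunt-TOPOAfcDNA5' (sizes in bp)
VECTOR_5P_BP = 3956   # fragment containing most of the vector
INSERT_5P_BP = 1339   # fragment carrying the 5' cDNA fragment

# Digest of pCR4Blunt-TOPOAfcDNA3' (sizes in bp)
BACKBONE_3P_BP = 5227  # vector plus the 3' cDNA fragment
SMALL_3P_BP = 31       # small fragment released by the digest

plasmid_5p_bp = VECTOR_5P_BP + INSERT_5P_BP    # size of the 5' clone
plasmid_3p_bp = BACKBONE_3P_BP + SMALL_3P_BP   # size of the 3' clone
ligated_bp = INSERT_5P_BP + BACKBONE_3P_BP     # expected ligation product

print(f"pCR4Blunt-TOPOAfcDNA5' total: {plasmid_5p_bp} bp")
print(f"pCR4Blunt-TOPOAfcDNA3' total: {plasmid_3p_bp} bp")
print(f"Expected ligation product:    {ligated_bp} bp")
```

The expected ligation product, 1,339 bp plus 5,227 bp, equals the 6,566 bp reported for pCR4Blunt-TOPOAfcDNA.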

Example 18 Construction of the pALFd6 and pALFd7 Saccharomyces cerevisiae Expression Vectors

The Aspergillus fumigatus beta-glucosidase full length cDNA was amplified by PCR using the following primers that have homology to pCu426 and the 5′ and 3′ sequences of the Aspergillus fumigatus beta-glucosidase cDNA:

AfumigatusBGUpper: 5′-CTTCTTGTTAGTGCAATATCATATAGAAGTCATCGACTAGTGGATCTACCATGAGATTCGGTTGGCTCG-3′ (SEQ ID NO: 61)

ATGAGATTCGGTTGGCTCG has homology to the 5′ end of the Aspergillus fumigatus cDNA.

AfumigatusBGLower: 5′-GCGTGAATGTAAGCGTGACATAACTAATTACATGACTCGAGCTAGTAGACACGGGGCAGAG-3′ (SEQ ID NO: 62)

CTAGTAGACACGGGGCAGAG has homology to the 3′ end of the Aspergillus fumigatus cDNA.
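Each of these primers consists of a 5′ tail followed by the cDNA-specific portion quoted above. The Python sketch below is illustrative only. It splits each primer at the quoted cDNA-specific sequence and looks for the standard Spe I (ACTAGT) and Xho I (CTCGAG) recognition sequences in the tails. Treating the whole tail as the vector-homologous region is an assumption of this sketch, and the helper name split_primer is hypothetical.

```python
# Split the gap-repair primers into a 5' tail and the cDNA-specific 3' portion.

PRIMERS = {
    # name: (full primer, cDNA-specific portion quoted in the text)
    "AfumigatusBGUpper": (
        "CTTCTTGTTAGTGCAATATCATATAGAAGTCATCGACTAGTGGATCTACC"
        "ATGAGATTCGGTTGGCTCG",
        "ATGAGATTCGGTTGGCTCG",
    ),
    "AfumigatusBGLower": (
        "GCGTGAATGTAAGCGTGACATAACTAATTACATGACTCGAG"
        "CTAGTAGACACGGGGCAGAG",
        "CTAGTAGACACGGGGCAGAG",
    ),
}

SPE_I = "ACTAGT"  # standard Spe I recognition sequence
XHO_I = "CTCGAG"  # standard Xho I recognition sequence

def split_primer(full, gene_part):
    """Return (tail, gene_part); the gene-specific part must be the 3' end."""
    if not full.endswith(gene_part):
        raise ValueError("gene-specific portion is not at the 3' end")
    return full[: -len(gene_part)], gene_part

tails = {}
for name, (full, gene_part) in PRIMERS.items():
    tail, gene = split_primer(full, gene_part)
    tails[name] = tail
    print(f"{name}: {len(full)} nt = {len(tail)} nt tail + {len(gene)} nt cDNA-specific")

print("Spe I site in upper tail:", SPE_I in tails["AfumigatusBGUpper"])
print("Xho I site in lower tail:", XHO_I in tails["AfumigatusBGLower"])
```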

The amplification reaction (100 μl) was composed of 0.5 μl of the pCR4Blunt-TOPOAfcDNA plasmid containing the Aspergillus fumigatus cDNA sequence, 1× Pfx Amplification Buffer, 50 μM each of dATP, dCTP, dGTP, and dTTP, 50 pmole of each above primer, 1.5 mM MgSO₄, and 2.5 units of Platinum Pfx DNA polymerase. The reactions were incubated in a RoboCycler Gradient 40 programmed for 1 cycle at 95° C. for 5 minutes; 25 cycles each at 95° C. for 1 minute, 50° C. for 1 minute, and 72° C. for 3 minutes; and a final extension cycle at 72° C. for 10 minutes. The PCR reaction was purified using a QIAquick PCR Purification Kit (QIAGEN Inc., Valencia, Calif.). DNA was eluted into 30 μl of EB buffer (QIAGEN Inc., Valencia, Calif.). The PCR product, which comprised 37 bp of homologous DNA sequence, was mixed with 1 μl of pCU426 gapped with Spe I and Xho I for cotransformation into Saccharomyces cerevisiae YNG318 competent cells as described in Example 10. These colonies did not turn blue, suggesting an error in the Aspergillus fumigatus beta-glucosidase cDNA sequence. Further sequencing of the Aspergillus fumigatus cDNA sequence indicated an insertion of an extra nucleotide in the cDNA sequence, which disrupted the open reading frame of the enzyme. Therefore, this construct had to be corrected.

In parallel with expression of the Aspergillus fumigatus beta-glucosidase cDNA in Saccharomyces cerevisiae, the native signal sequence of the Aspergillus fumigatus cDNA sequence was replaced with the Humicola insolens endoglucanase V signal sequence, also for expression in Saccharomyces cerevisiae, to compare the expression of the enzyme with each signal sequence. The Aspergillus fumigatus cDNA sequence was amplified by PCR with a primer that has homology to the Humicola insolens endoglucanase V signal sequence in pALFd1 as well as homology to the 5′-end of the mature Aspergillus fumigatus beta-glucosidase cDNA sequence. The primers used for amplification of the Aspergillus fumigatus beta-glucosidase cDNA sequence were the AfumigatusBGLower primer described above and the HiEGVAfumigatus primer shown below:

HiEGVAfumigatus: 5′-CCGCTCCGCCGTTGTGGCCGCCCTGCCGGTGTTGGCCCTTGCCGAATTGGCTTTCTCTCC-3′ (SEQ ID NO: 63)

GAATTGGCTTTCTCTCC has homology to the 5′ end of the Aspergillus fumigatus mature sequence.

The amplification reaction (100 μl) was composed of 0.5 μl of the pCR4Blunt-TOPOAfcDNA plasmid containing the Aspergillus fumigatus cDNA sequence, 1× Pfx Amplification Buffer, 50 μM each of dATP, dCTP, dGTP, and dTTP, 50 pmole of each above primer, 1.5 mM MgSO₄, and 2.5 units of Platinum Pfx DNA polymerase. The reactions were incubated in a RoboCycler Gradient 40 programmed for 1 cycle at 95° C. for 5 minutes; 25 cycles each at 95° C. for 1 minute, 50° C. for 1 minute, and 72° C. for 3 minutes; and a final extension cycle at 72° C. for 10 minutes. The PCR reaction was purified using a QIAquick PCR Purification Kit. DNA was eluted into 10 μl of EB buffer. Three μl of the cleaned-up PCR product was mixed with 1.8 μl of pALFd1 gapped with Eco NI and Xho I for cotransformation into Saccharomyces cerevisiae YNG318 competent cells as described in Example 10. These colonies turned light blue. However, one colony stood out by being very blue. DNA rescue from this colony was done according to the protocol described by Kaiser and Auer, 1993, BioTechniques 14: 552, except 20 μl of yeast lysis buffer (1% SDS, 10 mM Tris-HCl, 1 mM EDTA pH 8) was used, and the plasmid was transformed into E. coli SURE electroporation-competent cells (Stratagene, La Jolla, Calif.) for sequencing. Full-length sequencing indicated the Aspergillus fumigatus beta-glucosidase cDNA sequence was correct. This plasmid was designated pALFd7 (FIG. 20), which comprised the Aspergillus fumigatus beta-glucosidase cDNA sequence with the Humicola insolens endoglucanase V signal sequence for yeast expression.

To produce a yeast expression vector containing the correct Aspergillus fumigatus cDNA sequence with its native signal sequence, the region containing the correct nucleotide sequence from the yeast expression vector containing the Aspergillus fumigatus cDNA sequence with the Humicola insolens endoglucanase V signal sequence (pALFd7) was amplified by PCR using the above BGLU5.rev primer and the following primer:

BGL.7for: 5′-CTGGCGTTGGCGCTGTC-3′ (SEQ ID NO: 64)

The amplification reaction (100 μl) was composed of 0.5 μl of pALFd7, 1× Pfx Amplification Buffer, 50 μM each of dATP, dCTP, dGTP, and dTTP, 50 pmole of each above primer, 1.5 mM MgSO₄, and 2.5 units of Platinum Pfx DNA polymerase. The reactions were incubated in a RoboCycler Gradient 40 programmed for 1 cycle at 95° C. for 5 minutes; 25 cycles each at 95° C. for 1 minute, 50° C. for 1 minute, and 72° C. for 1 minute; and a final extension cycle at 72° C. for 10 minutes.

The 701 bp PCR fragment was purified using a QIAquick PCR Purification Kit. DNA was eluted into 10 μl of EB buffer. Three μl of the cleaned-up PCR product was mixed with 3 μl of the yeast expression vector containing the Aspergillus fumigatus cDNA sequence with the native signal sequence and the extra nucleotide, gapped with Sac II and Xma I, for cotransformation into Saccharomyces cerevisiae YNG318 competent cells as described in Example 10. These colonies turned blue. DNA rescue from one randomly picked blue colony was done as above, and the plasmid was transformed into E. coli SURE electroporation-competent cells (Stratagene, La Jolla, Calif.) for sequencing. Full-length sequencing indicated the Aspergillus fumigatus beta-glucosidase cDNA sequence was correct. This yeast expression vector was designated pALFd6 (FIG. 21), which comprised the Aspergillus fumigatus cDNA sequence with its native signal sequence.

Example 19 Expression of Aspergillus fumigatus Beta-Glucosidase Comparing Native and Heterologous Secretion Signal in Saccharomyces cerevisiae

Plasmids pALFd6 (containing the Aspergillus fumigatus beta-glucosidase cDNA with its native signal sequence) and pALFd7 (containing the Aspergillus fumigatus beta-glucosidase cDNA with the heterologous signal sequence), approximately 1 μg each, were individually transformed into freshly made Saccharomyces cerevisiae YNG318 competent cells, plated onto yeast selection plates, and incubated at 30° C. for 4 days as described in Example 10.

Two blue colonies from each construct were picked manually and inoculated into yeast selection medium (which contains copper) to induce expression and secretion of Aspergillus fumigatus beta-glucosidase. Broth from day 5 was then assayed in duplicate for beta-glucosidase activity using p-nitrophenyl-beta-D-glucopyranoside as substrate as described above. Cultures expressing beta-glucosidase with the heterologous signal sequence produced 2.5-fold more beta-glucosidase than cultures expressing beta-glucosidase with its native signal sequence.

Deposit of Biological Material

The following biological material has been deposited under the terms of the Budapest Treaty with the Agricultural Research Service Patent Culture Collection, Northern Regional Research Center, 1815 University Street, Peoria, Ill., 61604, and given the following accession number:

Deposit                    Accession Number    Date of Deposit
E. coli TOP10 (pEJG113)    NRRL B-30695        Oct. 17, 2003

The strain has been deposited under conditions that assure that access to the culture will be available during the pendency of this patent application to one determined by the Commissioner of Patents and Trademarks to be entitled thereto under 37 C.F.R. §1.14 and 35 U.S.C. §122. The deposit represents a substantially pure culture of the deposited strain. The deposit is available as required by foreign patent laws in countries wherein counterparts of the subject application, or its progeny are filed. However, it should be understood that the availability of a deposit does not constitute a license to practice the subject invention in derogation of patent rights granted by governmental action.

The invention described and claimed herein is not to be limited in scope by the specific aspects herein disclosed, since these aspects are intended as illustrations of several aspects of the invention. Any equivalent aspects are intended to be within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. In the case of conflict, the present disclosure including definitions will control.

Various references are cited herein, the disclosures of which are incorporated by reference in their entireties.

What is claimed is:
 1. A nucleic acid construct comprising a first polynucleotide comprising a nucleotide sequence encoding a signal peptide operably linked to a second polynucleotide comprising a nucleotide sequence encoding a polypeptide, wherein the first polynucleotide encoding the signal peptide is foreign to the second polynucleotide encoding the polypeptide, and the 3′ end of the first polynucleotide encoding the signal peptide is immediately upstream of the initiator codon of the second polynucleotide encoding the polypeptide; wherein the nucleotide sequence encoding the signal peptide is: (a) a nucleotide sequence encoding a signal peptide comprising an amino acid sequence having at least 90% sequence identity to SEQ ID NO: 37; or (b) a nucleotide sequence encoding a signal peptide comprising a sequence having at least 90% sequence identity to SEQ ID NO: 36.
 2. The nucleic acid construct of claim 1, wherein the nucleotide sequence encoding a signal peptide comprises an amino acid sequence having at least 95% sequence identity to SEQ ID NO: 37.
 3. The nucleic acid construct of claim 2, wherein the nucleotide sequence encoding a signal peptide comprises an amino acid sequence having at least 97% sequence identity to SEQ ID NO: 37.
 4. The nucleic acid construct of claim 1, wherein the nucleotide sequence encoding a signal peptide comprises a sequence having at least 95% sequence identity to SEQ ID NO: 36.
 5. The nucleic acid construct of claim 4, wherein the nucleotide sequence encoding a signal peptide comprises a sequence having at least 97% sequence identity to SEQ ID NO: 36.
 6. A recombinant expression vector comprising the nucleic acid construct of claim 1.
 7. A recombinant host cell comprising the nucleic acid construct of claim 1.